Speech Intelligibility Demo
The PSCR program, with support from DHS S&T, is studying the intelligibility of speech that is combined with background noise and then digitally encoded. The Listen tab that follows provides example Modified Rhyme Test (MRT) recordings from the study.
Audio codecs provide efficient (low data rate) digital representations of audio signals. When the signal is speech alone, a speech-specific signal model leads to efficient coding with good intelligibility. But when significant levels of background noise are combined with speech, broader or more robust signal models are required, and these in turn typically require higher data rates. Listeners should therefore expect higher intelligibility in the examples that use higher bit rates.
Audio Details and Samples
FM — Software simulation of analog FM land-mobile radio (full quieting signal, no interference).
P25-HR — APCO Project 25 half-rate codec (AMBE+2™, version 1.6).
P25-FR — APCO Project 25 full-rate codec (AMBE+2™, version 1.6).
AMR — Adaptive Multi-Rate codec. From http://www.3gpp.org/DynaReport/26104.htm. ANSI-C code for the floating-point Adaptive Multi-Rate (AMR) speech codec, ETSI/3GPP TS 26.104, Rev. 10.0.0, Apr. 2011.
AMR-WB — Adaptive Multi-Rate Wideband codec. From http://www.3gpp.org/DynaReport/26204.htm. Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; ANSI-C code, ETSI/3GPP TS 26.204, Rev. 10.0.0, Apr. 2011.
EVS — 3GPP Enhanced Voice Services codec. From http://www.3gpp.org/DynaReport/26442.htm. Codec for Enhanced Voice Services (EVS); ANSI-C code (fixed-point), 26442-c00-ANSI-C_source_code.zip, Version 12.0.0, Sept. 2014.
Opus — IETF Opus Interactive Audio Codec. From http://www.opus-codec.org, libopus 1.1, Oct. 2014.
Bit rates are the raw data rates required between encoder and decoder, in bits per second. They do not include any forward error correction, and the examples contain no bit errors or packet loss.
Narrowband — Supported audio bandwidth has a nominal upper limit between 3 and 4 kHz.
Wideband — Supported audio bandwidth has a nominal upper limit between 7 and 8 kHz.
Fullband — Supported audio bandwidth has a nominal upper limit near 20 kHz.
Club — Speech combined with recorded nightclub noise at 0 dB SNR.
Siren — Speech combined with recorded fire truck siren at 0 dB SNR.
Quiet — No extra noise added to speech.
All — Contains club, siren, and quiet versions, in that order.
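The 0 dB SNR condition used for the club and siren examples means that the speech and the added noise have equal average power. The sketch below illustrates one simple way to mix two signals at a target SNR. It is not the procedure used in the study, which is not described here: the function names `mean_power` and `mix_at_snr` are hypothetical, and measuring level as the mean power over the whole clip is an assumption (a study may instead measure speech level only over active speech).

```python
import math


def mean_power(samples):
    """Average power of a signal: the mean of its squared sample values."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db=0.0):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech`.

    Returns (mixed, scaled_noise). Assumes both inputs are sequences of
    float samples at the same sample rate (hypothetical helper, for
    illustration only).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]  # trim noise to the speech length

    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power and cannot be scaled")

    # SNR (dB) = 10 * log10(p_speech / p_scaled_noise), so the target
    # noise power is p_speech / 10^(snr_db / 10). Amplitude gain is the
    # square root of the required power ratio.
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)

    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise
```

With `snr_db=0.0` the scaled noise ends up with the same mean power as the speech, which is why a 0 dB condition is a demanding one for any codec that relies on a speech-specific signal model.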
Each audio sample in this demonstration contains the following speech, in this order:
Female speaker — “Please select the word pin.”
Male speaker — “Please select the word hark.”
Female speaker — “Please select the word wig.”
Male speaker — “Please select the word tip.”
Certain commercial products, organizations, and companies are identified in this audio demonstration material to specify adequately the technical aspects of the available audio files. In no case does such identification imply recommendation or endorsement by PSCR, NIST CTL, or NTIA ITS, nor does it imply that the products, organizations, or companies identified are necessarily the best available for the particular application or use.