We propose a monaural, intrusive speech intelligibility prediction (SIP) algorithm, the Spectro-Temporal Glimpsing Index (STGI), which detects glimpses in short-time segments of a spectro-temporal modulation decomposition of the input speech signals. Unlike existing glimpse-based SIP methods, STGI is not limited to additive uncorrelated noise and can be applied across a broad range of degradation conditions. Our results show that STGI performs consistently well across 15 datasets covering modulated noise, noise reduction processing, reverberation, near-end listening enhancement, checkerboard noise, and gated noise.
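To make the idea of detecting glimpses in short-time segments concrete, here is a minimal Python sketch. It is not the authors' STGI implementation. It assumes the clean and degraded signals have already been decomposed into matching feature channels (rows) over time frames (columns). It also assumes that a segment counts as a glimpse when the normalized correlation between its clean and degraded versions exceeds a threshold. The function name `glimpse_fraction`, the segment length, the threshold value, and the correlation criterion are all illustrative assumptions.

```python
import numpy as np


def glimpse_fraction(clean, degraded, seg_len=8, threshold=0.5):
    """Fraction of short-time segments judged to be 'glimpsed'.

    Hypothetical illustration only: `clean` and `degraded` are
    (channels x frames) arrays from some prior decomposition, and a
    segment is a glimpse if its clean/degraded correlation exceeds
    `threshold`. These choices are assumptions, not the STGI definition.
    """
    clean = np.asarray(clean, dtype=float)
    degraded = np.asarray(degraded, dtype=float)
    if clean.ndim != 2 or clean.shape != degraded.shape:
        raise ValueError("inputs must be 2-D arrays of identical shape")
    if clean.shape[1] < seg_len:
        raise ValueError("need at least seg_len frames")

    # Overlapping segments: shape (channels, n_segments, seg_len).
    c = np.lib.stride_tricks.sliding_window_view(clean, seg_len, axis=1)
    d = np.lib.stride_tricks.sliding_window_view(degraded, seg_len, axis=1)

    # Remove the per-segment mean before correlating.
    c = c - c.mean(axis=-1, keepdims=True)
    d = d - d.mean(axis=-1, keepdims=True)

    num = (c * d).sum(axis=-1)
    den = np.linalg.norm(c, axis=-1) * np.linalg.norm(d, axis=-1)

    # Constant segments have zero norm; treat their correlation as 0.
    corr = np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    # A segment is a glimpse when the correlation exceeds the threshold.
    return float((corr > threshold).mean())
```

Because the criterion in this sketch compares the degraded representation with the clean one directly, it makes no assumption that the degradation is additive noise. This is only meant to illustrate why a segment-wise comparison can apply to other kinds of degradation, such as reverberation or processing artifacts.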
Cite as: Edraki, A., Chan, W.-Y., Jensen, J., Fogerty, D. (2021) A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction. Proc. Interspeech 2021, 206-210, doi: 10.21437/Interspeech.2021-605
@inproceedings{edraki21_interspeech,
  author={Amin Edraki and Wai-Yip Chan and Jesper Jensen and Daniel Fogerty},
  title={{A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={206--210},
  doi={10.21437/Interspeech.2021-605}
}