ISCA Archive Interspeech 2022

FFM: A Frame Filtering Mechanism To Accelerate Inference Speed For Conformer In Speech Recognition

Zongfeng Quan, Nick J.C. Wang, Wei Chu, Tao Wei, Shaojun Wang, Jing Xiao

This paper proposes a frame filtering mechanism (FFM) to accelerate inference for speech recognition. The FFM consists of three parts: an invalid-frame indicator that decides whether each frame is invalid, a filtering strategy that removes the invalid frames, and an extractor attention block that recalls useful information from the filtered frames. The feature sequence becomes shorter after the FFM block, so inference is accelerated. Compared with other downsampling approaches on LibriSpeech, our method achieves the best word error rate (WER) with the lowest real-time factor (RTF). Experiments on Aishell-1 show that our approach reduces the sequence length by up to 73% and achieves a 21.1% to 34.5% relative RTF reduction, with a relative WER increase of no more than 5.8%.
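The three parts described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `frame_filter`, the fixed `threshold`, the precomputed `invalid_scores`, and the choice to let each kept frame attend over all original frames with a residual connection are assumptions made purely for illustration. In the paper, the indicator and the extractor attention block are learned components.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def frame_filter(frames, invalid_scores, threshold=0.5):
    """Hypothetical sketch of a frame filtering block.

    frames:         list of T feature vectors, each of dimension D
    invalid_scores: list of T scores in [0, 1]; higher means "more likely invalid"
                    (assumed to come from some indicator, not modeled here)
    Returns the kept frame indices and the shortened, attention-enriched sequence.
    """
    # Filtering strategy (assumed): drop frames whose score reaches the threshold.
    keep = [i for i, s in enumerate(invalid_scores) if s < threshold]
    if not keep:
        # Never return an empty sequence: keep the least-invalid frame.
        keep = [min(range(len(frames)), key=lambda i: invalid_scores[i])]

    dim = len(frames[0])
    scale = math.sqrt(dim)
    shortened = []
    for i in keep:
        # Extractor attention (assumed form): the kept frame is the query,
        # all original frames are keys and values, so information from
        # removed frames can still flow into the shortened sequence.
        weights = softmax([dot(frames[i], f) / scale for f in frames])
        attended = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
        # Residual connection around the attention output.
        shortened.append([x + a for x, a in zip(frames[i], attended)])
    return keep, shortened

if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    scores = [0.1, 0.9, 0.2, 0.8]
    keep, shortened = frame_filter(frames, scores)
    print(keep)                                # indices of the frames that survive
    print(1 - len(shortened) / len(frames))    # fraction of the sequence removed
```

Because the later encoder layers only process the shortened sequence, their cost falls with the number of frames removed, which is the source of the RTF reduction the abstract reports.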


doi: 10.21437/Interspeech.2022-656

Cite as: Quan, Z., Wang, N.J.C., Chu, W., Wei, T., Wang, S., Xiao, J. (2022) FFM: A Frame Filtering Mechanism To Accelerate Inference Speed For Conformer In Speech Recognition. Proc. Interspeech 2022, 3153-3157, doi: 10.21437/Interspeech.2022-656

@inproceedings{quan22_interspeech,
  author={Zongfeng Quan and Nick J.C. Wang and Wei Chu and Tao Wei and Shaojun Wang and Jing Xiao},
  title={{FFM: A Frame Filtering Mechanism To Accelerate Inference Speed For Conformer In Speech Recognition}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={3153--3157},
  doi={10.21437/Interspeech.2022-656}
}