Mel scaled M-band wavelet filter bank for speech recognition

International Journal of Speech Technology

Abstract

A Mel scaled M-band wavelet filter bank structure is used to extract robust acoustic features for speech recognition. The proposed filter bank allows a flexible frequency partition that decomposes the speech signal into M frequency bands. To quantify the difference between the Mel scaled M-band wavelet filter bank and the dyadic wavelet filter bank, the relative bandwidth deviation (RBD) and the root mean square bandwidth deviation (RMSBD) are calculated with respect to the baseline (the Mel filter bank bandwidth). The proposed filter bank reduces RBD and RMSBD by 40.90% and 49.84%, respectively, compared with the 24-dyadic wavelet filter bank. Features extracted with the proposed filter bank on the AMUAV corpus improve word recognition accuracy (WRA) over baseline (MFCC) features across the full SNR range (20 dB to 0 dB). For the AMUAV corpus, the proposed features show a maximum improvement in WRA of 3.93% over baseline features and 3.90% over dyadic wavelet filter bank features. When applied to the VidTIMIT corpus, the proposed features show a maximum improvement in WRA of 1.64% over baseline features and 4.43% over dyadic features.
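
The bandwidth comparison described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the abstract does not define RBD or RMSBD, so the formulas below (mean relative absolute deviation and root mean square deviation of per-band bandwidths) are assumptions. The function names, the 4 kHz upper frequency, the non-overlapping band partition, and the uniform partition used as a stand-in candidate are likewise hypothetical. Only the Hz-to-Mel mapping, 2595 log10(1 + f/700), is a standard formula.

```python
import numpy as np


def hz_to_mel(f_hz):
    """Standard Hz-to-Mel mapping: 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_bandwidths(n_bands, f_max_hz):
    """Bandwidths (Hz) of n_bands contiguous bands equally spaced on the Mel scale.

    Assumption: non-overlapping bands from 0 Hz to f_max_hz, a simplification
    of the usual overlapping triangular Mel filters.
    """
    edges_mel = np.linspace(0.0, hz_to_mel(f_max_hz), n_bands + 1)
    return np.diff(mel_to_hz(edges_mel))


def relative_bandwidth_deviation(bw, bw_ref):
    """Assumed RBD: mean of |bw - bw_ref| / bw_ref over all bands."""
    bw = np.asarray(bw, dtype=float)
    bw_ref = np.asarray(bw_ref, dtype=float)
    return float(np.mean(np.abs(bw - bw_ref) / bw_ref))


def rms_bandwidth_deviation(bw, bw_ref):
    """Assumed RMSBD: square root of the mean squared bandwidth difference (Hz)."""
    bw = np.asarray(bw, dtype=float)
    bw_ref = np.asarray(bw_ref, dtype=float)
    return float(np.sqrt(np.mean((bw - bw_ref) ** 2)))


if __name__ == "__main__":
    n_bands, f_max_hz = 24, 4000.0  # hypothetical values for illustration
    reference = mel_bandwidths(n_bands, f_max_hz)
    # Stand-in candidate: a uniform partition, not a wavelet filter bank.
    candidate = np.full(n_bands, f_max_hz / n_bands)
    print("RBD:", relative_bandwidth_deviation(candidate, reference))
    print("RMSBD (Hz):", rms_bandwidth_deviation(candidate, reference))
```

Under these assumed definitions, a candidate partition whose bandwidths track the Mel reference more closely yields smaller values of both measures, which is the sense in which the abstract reports a reduction relative to the dyadic filter bank.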

Acknowledgements

The authors would like to acknowledge the Institution of Electronics and Telecommunication Engineers (IETE) for sponsoring the research fellowship during this period of research.

Author information

Corresponding author

Correspondence to Prashant Upadhyaya.

About this article

Cite this article

Upadhyaya, P., Farooq, O. & Abidi, M.R. Mel scaled M-band wavelet filter bank for speech recognition. Int J Speech Technol 21, 797–807 (2018). https://doi.org/10.1007/s10772-018-9545-2

  • DOI: https://doi.org/10.1007/s10772-018-9545-2