Formosa Speech in the Wild Corpus for Improving Taiwanese Mandarin Speech-Enabled Human-Computer Interaction


Abstract

Mandarin in Taiwan differs notably from other varieties of Mandarin in lexical use and accent. However, from an investment perspective, it remains debated whether general-purpose Mandarin speech recognition (MSR) systems are sufficient to support human-computer interaction in Taiwan. To address this question, we established the Formosa (an old name for Taiwan given by the Portuguese) Speech in the Wild (FSW) project (Liao 2018) to (1) collect large-scale Taiwanese Mandarin speech to boost the development of Taiwanese-specific MSR techniques, and (2) host the Formosa Speech Recognition (FSR) challenge (Liao 2018) to promote the corpus and to evaluate the performance of available Taiwanese-specific MSR systems. The FSW project focuses on transcribing spontaneous Taiwanese Mandarin speech selected from real-life, multi-genre broadcast radio speech provided by Taiwan’s National Education Radio (2018). We plan to publicly release about 3000 hours of speech data at the end of 2019. FSR-2018 (Liao 2018), the culmination of FSW’s events in 2018, featured a Taiwanese broadcast Mandarin speech recognition evaluation campaign using the released corpora. The challenge was also an official activity (Liao 2018) of the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) (ISCSLP 2018). At the end of 2018, the first four volumes of the FSW Corpus, NER-Trs-Vol1∼4, totaling 610.2 hours of speech data, were released to support two events: Formosa Grand Challenge, Talk to AI (FGC) (Ministry of Science And Technology Taiwan 2018) (Dec. 2017 to Mar. 2019) and the FSR-2018 challenge (Liao 2018) (June 2018 to Nov. 2018), which had 147 and 27 participating teams, respectively. For FSR-2018, 16 teams submitted 30 recognition results on the final-test set. The evaluation results revealed that the best Taiwanese-specific MSR system achieved a Chinese character error rate (CER) of 8.1%. For reference, iFlyTek’s (IFlyTek 2018) and Google’s (2018) commercial MSR systems, which were not optimized for this task, achieved CERs of 18.8% and 20.6%, respectively. Taken together, these results support our argument that a Taiwanese-specific MSR system is necessary for improving Taiwanese Mandarin speech-enabled human-computer interaction.
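
The systems above are compared by character error rate. As an illustration only, the following minimal sketch shows how a CER is commonly computed: the character-level edit distance between reference and hypothesis transcripts, summed over all utterances and divided by the total number of reference characters. The function names `edit_distance` and `character_error_rate` are hypothetical, and the sketch assumes transcripts are already normalized (no spaces or punctuation); it is not the scoring tool used in FSR-2018.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total edit errors / total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_chars
```

For example, a six-character reference such as 今天天氣很好 recognized with one wrong character (今天天汽很好) yields one substitution over six reference characters, a CER of 1/6, or about 16.7%.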

References

Boersma, P., & Weenink, D. (2018). Praat: doing phonetics by computer. http://www.fon.hum.uva.nl/praat/. Accessed 2019-01-28.
Bu, H., Du, J., Na, X., Wu, B., Zheng, H. (2018). AISHELL-1: an open-source Mandarin speech corpus and a speech recognition baseline. In 2017 20th Conference of the oriental chapter of international committee for coordination and standardization of speech databases and assessment techniques, O-COCOSDA 2017 (pp. 1–5), DOI https://doi.org/10.1109/ICSDA.2017.8384449.
Chan, W., Jaitly, N., Le, Q., Vinyals, O. (2016). Listen, attend and spell: a neural network for large vocabulary conversational speech recognition. In ICASSP, IEEE International conference on acoustics, speech and signal processing - proceedings (pp. 4960–4964), DOI https://doi.org/10.1109/ICASSP.2016.7472621.
Chang, H.j., Chao, W.c., Lo, T.h., Chen, B. (2018). NTNU Speech recognition system at FSR 2018. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/1W2T76fyUj4mSFcKa7Z2kieVZMmdYVoWf. Accessed 2019-01-20.
Chang, Y.H.S., Liao, Y.F., Wang, S.M., Wang, J.H., Wang, S.Y., Chen, J.W., Chen, Y.D. (2017). Development of a large-scale Mandarin radio speech corpus. In 2017 IEEE International conference on consumer electronics - Taiwan, ICCE-TW 2017 (pp. 359–360), DOI https://doi.org/10.1109/ICCE-China.2017.7991144.
Chen, L.h., Hu, C.k., Hung, L.j., Lin, C.w. (2018). Towards a robust Taiwanese Mandarin automatic speech recognition system with Kaldi toolkit. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15p5T43Qb3XVGkQbPTlH1MlhpdUa-Z-nL. Accessed 2019-01-20.
Chiu, C.C., Sainath, T.N., Wu, Y., Prabhavalkar, R., Nguyen, P., Chen, Z., Kannan, A., Weiss, R.J., Rao, K., Gonina, E., Jaitly, N., Li, B., Chorowski, J., Bacchiani, M. (2018). State-of-the-art speech recognition with sequence-to-sequence models. In ICASSP, IEEE International conference on acoustics, speech and signal processing - proceedings (pp. 4774–4778), DOI https://doi.org/10.1109/ICASSP.2018.8462105.
Du, J., Na, X., Liu, X., Bu, H. (2018). AISHELL-2: transforming Mandarin ASR research into industrial scale. arXiv:1808.10583. Accessed 2019-01-20.
ELRA Catalogue. (2006). Taiwan Mandarin speecon database – ELRA catalogue. http://catalogue.elra.info/en-us/repository/browse/ELRA-S0212/. Accessed 2019-01-18.
ESPnet. (2018). ESPnet: end-to-end speech processing toolkit. https://github.com/espnet/espnet. Accessed 2019-01-26.
Facebook Research. (2018). GitHub - facebookresearch/wav2letter: Facebook AI research automatic speech recognition toolkit. https://github.com/facebookresearch/wav2letter. Accessed 2019-01-28.
Ghahremani, P., BabaAli, B., Povey, D., Riedhammer, K., Trmal, J., Khudanpur, S. (2014). A pitch extraction algorithm tuned for automatic speech recognition. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP).
Ghahremani, P., Manohar, V., Hadian, H., Povey, D., Khudanpur, S. (2017). Investigation of transfer learning for ASR using LF-MMI trained neural networks. In ASRU 2017. http://www.danielpovey.com/files/2017_asru_transfer_learning.pdf. Accessed 2019-01-26.
Google. (2018). Cloud speech-to-text. https://cloud.google.com/speech-to-text/. Accessed 2019-01-26.
Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., Prenger, R., Satheesh, S., Sengupta, S., Coates, A., Ng, A.Y. (2014). Deep speech: scaling up end-to-end speech recognition. arXiv:1412.5567v2.
Hori, T., Cho, J., Watanabe, S. (2018). End-to-end speech recognition with word-based RNN language models. arXiv:1808.02608.
Hsu, W.H. (2018). A preliminary study on speaker diarization for automatic transcription of broadcast radio speech. Ph.D. thesis, National Taipei university of Technology. https://ir.lib.ntut.edu.tw/wSite/ct?mp=ntut&xItem=71271&ctNode=447. Accessed 2019-01-13.
Huang, C., & Chen, K. (1998). Academia Sinica balanced corpus of Modern Chinese. http://ckip.iis.sinica.edu.tw/CKIP/engversion/20corpus.htm. Accessed 2019-01-27.
Huang, C.R. (2009). Tagged Chinese gigaword version 2.0. http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2009T14. Accessed 2019-01-26.
Hung, H.t. (2018). The AlexHT system for FSR challenge. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15hjrTipVW0QOxb_tp_UAV9C2evdORdLu. Accessed 2019-01-20.
IFlyTek. (2018). iFLYTEK open platform —— China’s first artificial intelligence open platform for mobile internet and intelligent hardware developers. http://global.xfyun.cn/. Accessed 2019-01-19.
ISCSLP. (2018). ISCSLP2018 - the 11th international symposium on chinese spoken language processing (ISCSLP 2018). http://iscslp2018.org/. Accessed 2019-01-26.
Jaitly, N., & Hinton, G.E. (2013). Vocal tract length perturbation (VTLP) improves speech recognition. In ICML Workshop on deep learning for audio, speech and language. https://pdfs.semanticscholar.org/3de0/616eb3cd4554fdf9fd65c9c82f2605a17413.pdf. Accessed 2019-01-26.
Kaldi-ASR. (2018). Kaldi speech recognition toolkit. https://github.com/kaldi-asr/kaldi. Accessed 2019-01-27.
Kanda, N., Fujita, Y., Nagamatsu, K. (2018). Lattice-free state-level minimum Bayes risk training of acoustic models.
KingLine Data Center. (2018). KingLine data center. http://kingline.speechocean.com/. Accessed 2019-01-26.
KingLine Data Center. (2018). Taiwanese and english mixed speech recognition corpus (Mobile)-Sentences-1026 Speakers_ASR-Corpus_Commercial Resources_KingLine Data Center. http://kingline.speechocean.com/exchange.php?id=14927&act=view. Accessed 2019-01-27.
KingLine Data Center. (2018). Taiwanese speech recognition corpus (desktop)-conversation-300 Speakers_ASR-Corpus_ Commercial Resources_KingLine Data Center. http://kingline.speechocean.com/exchange.php?id=19262&act=view. Accessed 2019-01-27.
KingLine Data Center. (2018). Taiwanese speech recognition corpus (desktop)-sentences-204 Speakers_ASR-Corpus_Commercial Resources_KingLine Data Center. http://kingline.speechocean.com/exchange.php?act=view&id=1548. Accessed 2019-01-27.
KingLine Data Center. (2018). Taiwanese speech recognition corpus (mobile)-conversation-300 Speakers_ASR-Corpus_ Commercial Resources_KingLine Data Center. http://kingline.speechocean.com/exchange.php?id=19228&act=view. Accessed 2019-01-27.
KingLine Data Center. (2018). Taiwanese speech recognition corpus (mobile)-Ssentences-5232 Speakers_ASR-Corpus_ Commercial Resources_KingLine Data Center. http://kingline.speechocean.com/exchange.php?id=766&act=view. Accessed 2019-01-27.
Ko, T., Peddinti, V., Povey, D., Khudanpur, S. (2015). Audio augmentation for speech recognition. In Proceedings of the annual conference of the international speech communication association, INTERSPEECH. https://www.danielpovey.com/files/2015_interspeech_augmentation.pdf. Accessed 2019-01-26 (pp. 3586–3589).
Lee, H.s., Chen, K.y., Tsao, Y., Wang, H.m. (2018). The AS Kaldi-based Taiwanese Mandarin ASR system for FSR-2018. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15gsMr_ZtT6Wuotz8-T9Gysj-7-tJz4Mw. Accessed 2019-01-20.
Liang, H.b., & Wang, Y.r. (2018). The NCTU ASR system for Formosa speech recognition challenge 2018. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15inv3RHf9bTxwhwqrXwWbqNcxAfDgoxl. Accessed 2019-01-20.
Liao, Y.F. (2018). Call for FSR-2018 participants - ISCSLP2018. http://iscslp2018.org/CFParticipants.html. Accessed 2019-01-26.
Liao, Y.F. (2018). Formosa speech in the wild corpus. https://sites.google.com/speech.ntut.edu.tw/fsw/home/corpus. Accessed 2019-01-26.
Liao, Y.F. (2018). Formosa speech in the wild project - GitLab server. https://speech.nchc.org.tw. Accessed 2019-01-27.
Liao, Y.F. (2018). Formosa speech recognition challenge 2018. https://sites.google.com/speech.ntut.edu.tw/fsw/home/challenge. Accessed 2019-01-26.
Liao, Y.F. (2018). Formosa speech recognition challenge 2018. https://sites.google.com/speech.ntut.edu.tw/fsw/home/workshop. Accessed 2019-01-26.
Liao, Y.F. (2018). Formosa speech recognition recipe. https://github.com/yfliao/kaldi/tree/master/egs/formosa. Accessed 2019-01-28.
Liao, Y.F. (2018). Formosa speech recognition recipe. https://github.com/kaldi-asr/kaldi/tree/master/egs/formosa. Accessed 2019-01-28.
Liao, Y.F. (2018). Kaldi pull request #2474-formosa_speech recipe and database for Taiwanese Mandarin speech recognition. https://github.com/kaldi-asr/kaldi/pull/2474. Accessed 2019-01-28.
Liao, Y.F., Chang, Y.H.S., Wang, S.Y., Chen, J.W., Wang, S.M., Wang, J.H. (2018). A progress report of the Taiwan Mandarin radio speech corpus project. In 2017 20th conference of the oriental chapter of international committee for coordination and standardization of speech databases and assessment techniques, O-COCOSDA 2017 (pp. 1–6), DOI https://doi.org/10.1109/ICSDA.2017.8384450.
Liao, Y.F., Hsu, W.H., Lin, Y.C., Chang, Y.h.S., Pleva, M. (2018). Formosa speech recognition challenge 2018: data, plan and baselines, IEEE.
Linguistic Data Consortium. (1996). CALLFRIEND Mandarin Chinese-Taiwan dialect - linguistic data consortium. https://catalog.ldc.upenn.edu/LDC96S56. Accessed 2019-01-27.
Linguistic Data Consortium. (1998). Taiwanese Putonghua speech and transcripts - linguistic data consortium. https://catalog.ldc.upenn.edu/LDC98S72. Accessed 2019-01-27.
Linguistic Data Consortium. (2008). U.o.P.: linguistic data consortium webpage. https://www.ldc.upenn.edu/. Accessed 2019-01-26.
Lu, M.p., & Chen, C.p. (2018). NSYSU team for the Formosa speech recognition challenge 2018. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15ndD-mwfM3JZ0DX_6BfArxSdb5J1dDYQ. Accessed 2019-01-20.
Manohar, V., Hadian, H., Povey, D., Khudanpur, S. (2018). Semi-supervised training of acoustic models using lattice-free MMI. In ICASSP 2018 (pp. 4844–4848).
Mikolov, T., Kombrink, S., Deoras, A., Burget, L., Černocký, J. (2011). RNNLM - recurrent neural network language modeling toolkit. In Proceedings of ASRU 2011. http://www.fit.vutbr.cz/imikolov/rnnlm/rnnlm-demo.pdf. Accessed 2019-01-20 (pp. 196–201).
Milivojević, Z., Savić, N., Brodić, D. (2017). Three-parametric cubic interpolation for estimating the fundamental frequency of the speech signal. Computing and Informatics, 36(2), 449–469.
Ministry of Science And Technology Taiwan. (2018). Formosa grand challenge, talk to AI. https://fgc.stpi.narl.org.tw/activity/techai2018. Accessed 2019-01-26.
Mozilla. (2013). Common voice. https://voice.mozilla.org/zh-TW. Accessed 2019-01-27.
Mozilla. (2018). Project DeepSpeech. https://github.com/mozilla/DeepSpeech. Accessed 2019-01-28.
National Education Radio. (2018). National education radio. https://www.ner.gov.tw/english. Accessed 2019-01-20.
National Statistics Taiwan. (2010). 2010 population and housing census in 2010. https://www.stat.gov.tw/public/Attachment/21081884771.pdf. Accessed 2019-01-13.
Phonetics Laboratory. (2018). U.o.P.: the penn phonetics lab forced aligner. https://babel.ling.upenn.edu/phonetics/old_website_2015/p2fa/index.html. Accessed 2019-01-26.
Povey, D., Cheng, G., Wang, Y., Li, K., Xu, H., Yarmohamadi, M., Khudanpur, S. (2018). Semi-orthogonal low-rank matrix factorization for deep neural networks. In Interspeech 2018. https://doi.org/10.21437/Interspeech.2018-1417, http://www.danielpovey.com/files/2018_interspeech_tdnnf.pdf, (Vol. 2 pp. 3743–3747).
Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlíček, P., Qian, Y., Schwarz, P., Silovský, J., Stemmer, G., Veselý, K. (2011). The Kaldi speech recognition toolkit. In ASRU 2011. http://kaldi.sf.net/.
Povey, D., Peddinti, V., Galvez, D., Ghahremani, P., Manohar, V., Na, X., Wang, Y., Khudanpur, S. (2016). Purely sequence-trained neural networks for ASR based on lattice-free MMI. In Proceedings of the annual conference of the international speech communication association, INTERSPEECH (pp. 2751–2755), DOI https://doi.org/10.21437/Interspeech.2016-595.
Sak, H., Senior, A.W., Rao, K., Beaufays, F. (2015). Fast and accurate recurrent neural network acoustic models for speech recognition. In INTERSPEECH 2015, 16th annual conference of the international speech communication association, Dresden, Germany, September 6-10, 2015. http://www.isca-speech.org/archive/interspeech_2015/i15_1468.html (pp. 1468–1472).
SpeechOcean. (2018). Speech data services - text data and image data services - speech datasets database. http://en.speechocean.com/. Accessed 2019-01-26.
Steering Committee for the Test Of Proficiency-Huayu. (2018). The test of Chinese as a foreign language (TOCFL). https://www.sc-top.org.tw/english/eng_index.php. Accessed 2019-01-26.
Tan, T., Qian, Y., Hu, H., Zhou, Y., Ding, W., Yu, K. (2018). Adaptive very deep convolutional residual network for noise robust speech recognition. IEEE/ACM Transactions on Audio Speech and Language Processing, 26 (8), 1393–1405. https://doi.org/10.1109/TASLP.2018.2825432.
Tang, H., Lu, L., Kong, L., Gimpel, K., Livescu, K., Dyer, C., Smith, N.A., Renals, S. (2017). End-to-end neural segmental models for speech recognition. IEEE Journal on Selected Topics in Signal Processing, 11(8), 1254–1264. https://doi.org/10.1109/JSTSP.2017.2752462.
The Association for Computational Linguistics and Chinese Language Processing. (2000). Database - the association for computational linguistics and chinese language processing. http://www.aclclp.org.tw/corp.php. Accessed 2019-01-27.
The Association for Computational Linguistics and Chinese Language Processing. (2018). TCC300 Corpus. http://www.aclclp.org.tw/use_mat.php#tcc300edu. Accessed 2019-01-27.
The Association for Computational Linguistics and Chinese Language Processing. (2018). The association for computational linguistics and chinese language processing. http://www.aclclp.org.tw. Accessed 2019-01-26.
The European Language Resources Association. (2018). ELRA-ELDA: the evaluations and language resources distribution agency. http://www.elra.info/en/. Accessed 2019-01-26.
Wang, H.c., Seide, F., Tseng, C.y., Lee, L.s. (2000). Mat-2000 – design, collection, and validation of a Mandarin 2000-speaker telephone speech database. In InterSpeech (pp. 3–6).
Wang, H.M. (2003). MATBN 2002: a Mandarin Chinese broadcast news corpus. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition, 10(2), 219–236.
Wells, J.C. (1995). Computer-coding the IPA: a proposed extension of SAMPA. https://doi.org/10.1179/1362171811Y.0000000076, https://www.phon.ucl.ac.uk/home/sampa/ipasam-x.pdf. Accessed 2019-01-26.
Wikipedia. (2018). Languages of Taiwan. http://www.ethnologue.com/show_country.asp?name=Taiwan. Accessed 2019-01-27.
Wikipedia. (2018). Taiwanese Mandarin. https://en.wikipedia.org/wiki/Taiwanese_Mandarin. Accessed 2019-01-20.
Wu, M.c., Chen, W.y., Misbullah, A. (2018). Established a Taiwanese speech recognition system for formosa speech recognition challenge 2018. In Formosa speech recognition challenge workshop. https://drive.google.com/file/d/15kKTG_w_jbx20vW1s_6rBAScXMAHYxd-.
Xiong, W., Droppo, J., Huang, X., Seide, F., Seltzer, M., Stolcke, A., Yu, D., Zweig, G. (2017). The Microsoft 2016 conversational speech recognition system. In ICASSP, IEEE International conference on acoustics, speech and signal processing - proceedings. https://doi.org/10.1109/ICASSP.2017.7953159, https://arxiv.org/pdf/1708.06073.pdf (pp. 5255–5259).
Xu, H., Povey, D., Mangu, L., Zhu, J. (2011). Minimum Bayes Risk decoding and system combination based on a recursion for edit distance. Computer Speech & Language, 25(4), 802–828. https://doi.org/10.1016/j.csl.2011.03.001.

Acknowledgements

This research was funded by Taiwan’s Ministry of Science and Technology (MOST 106-3011-F-027-006, 107-3011-F-027-003, 106-2221-E-027-128, 107-2221-E-027-102, 108-2221-E-027-067, 107-2911-I-027-501 and 108-2911-I-027-501), by the Slovak Research and Development Agency (APVV SK-TW-2017-0005), and by Cultural and Educational Grant Agency project KEGA 009TUKE-4/2019 and Scientific Grant Agency project VEGA 1/0511/17, both financed by the Ministry of Education, Science, Research and Sport of the Slovak Republic.

This work was made possible by content contributed by National Education Radio, Taiwan. The authors also give thanks for English proofreading.

Author information

Yuan-Fu Liao
Present address: Complex Building 403, No.1, Sec. 3, Zhongxiao E. Rd., Taipei City, 10608, Taiwan

Authors and Affiliations

Department of Electronic Engineering, National Taipei University of Technology, Taipei, Taiwan
Yuan-Fu Liao, Yu-Chen Lin & Wu-Hua Hsu
Department of English, National Taipei University of Technology, Taipei, Taiwan
Yung-Hsiang Shawn Chang
Department of Electronics and Multimedia Communications, KEMT FEI, Technical University of Kosice, Letna 9, 04200, Kosice, Slovakia
Matus Pleva & Jozef Juhar

Authors

Yuan-Fu Liao
Yung-Hsiang Shawn Chang
Yu-Chen Lin
Wu-Hua Hsu
Matus Pleva
Jozef Juhar

Corresponding author

Correspondence to Yuan-Fu Liao.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liao, YF., Chang, YH.S., Lin, YC. et al. Formosa Speech in the Wild Corpus for Improving Taiwanese Mandarin Speech-Enabled Human-Computer Interaction. J Sign Process Syst 92, 853–873 (2020). https://doi.org/10.1007/s11265-019-01483-4

Received: 01 February 2019
Revised: 08 August 2019
Accepted: 09 September 2019
Published: 28 November 2019
Issue Date: August 2020
DOI: https://doi.org/10.1007/s11265-019-01483-4
