A Multi-task Learning Approach for Mandarin-English Code-Switching Conversational Speech Recognition

  • Conference paper
Computational Intelligence and Intelligent Systems (ISICA 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 873)

Abstract

We propose a multi-task learning approach based on deep neural networks (MTL-DNN) that jointly performs Mandarin-English code-switching conversational speech recognition (MECS-CSR), the primary task, and language identification (LID), the auxiliary task. In our approach, the hidden layers of the DNNs for the primary task are fused with those of the DNN for the auxiliary task by sharing weight and bias parameters. Extensive experiments are carried out on LDC2015S04, with Mixed Error Rate (MER) as the performance metric for code-switching speech recognition. Compared with the baseline and with the first MECS-CSR system [1] on LDC2015S04, the proposed approach achieves relative MER reductions of 4.57% and 4.07%, respectively. The results show that the proposed approach captures more language-switching information from the auxiliary task and significantly outperforms the competing single-task algorithms.
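
The parameter sharing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the Notes entry points to their Kaldi script); it only assumes a stack of hidden layers shared by two softmax output layers, one for the recognition targets and one for LID, trained on a weighted sum of the two cross-entropy losses. The class name `SharedHiddenMTL`, the layer sizes, the tanh activation, and the auxiliary weight `aux_weight` are all hypothetical choices made for illustration.

```python
import numpy as np


def init_layer(rng, n_in, n_out):
    """Return small random weights and zero biases for one affine layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the correct labels."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))


class SharedHiddenMTL:
    """Hypothetical two-task network: shared hidden layers, two output heads."""

    def __init__(self, n_feat, n_hidden, n_layers, n_targets, n_langs, seed=0):
        rng = np.random.default_rng(seed)
        dims = [n_feat] + [n_hidden] * n_layers
        # These weights and biases are used by BOTH tasks.
        self.shared = [init_layer(rng, a, b) for a, b in zip(dims[:-1], dims[1:])]
        # Task-specific output layers (assumed sizes).
        self.asr_head = init_layer(rng, n_hidden, n_targets)  # primary task
        self.lid_head = init_layer(rng, n_hidden, n_langs)    # auxiliary task

    def forward(self, x):
        h = x
        for w, b in self.shared:
            h = np.tanh(h @ w + b)  # activation choice is an assumption
        asr = softmax(h @ self.asr_head[0] + self.asr_head[1])
        lid = softmax(h @ self.lid_head[0] + self.lid_head[1])
        return asr, lid

    def joint_loss(self, x, asr_labels, lid_labels, aux_weight=0.3):
        """Primary loss plus a down-weighted auxiliary loss (weight assumed)."""
        asr, lid = self.forward(x)
        return cross_entropy(asr, asr_labels) + aux_weight * cross_entropy(lid, lid_labels)


# Toy usage with random frames and labels (all sizes are illustrative).
rng = np.random.default_rng(1)
model = SharedHiddenMTL(n_feat=40, n_hidden=64, n_layers=3, n_targets=100, n_langs=2)
frames = rng.normal(size=(8, 40))
asr_probs, lid_probs = model.forward(frames)
loss = model.joint_loss(frames, rng.integers(0, 100, 8), rng.integers(0, 2, 8))
```

Because both heads read the same hidden representation, gradients from the LID loss would also update the shared layers during training, which is one plausible way for language-switching information to reach the recognition task.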

Notes

  1. https://github.com/xiaosdawn/Kaldi-multi-task/blob/master/egs/wsj/s5/local/online/run_multitask2.sh

References

  1. Vu, N.T., Lyu, D.-C., Weiner, J., Telaar, D., Schlippe, T., Blaicher, F., et al.: A first speech recognition system for Mandarin-English code-switch conversational speech. In: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4889–4892 (2012)

  2. Li, Y., Fung, P.: Code switching language model with translation constraint for mixed language speech recognition. In: Proceedings of COLING, pp. 1671–1680 (2012)

  3. Chen, M., et al.: Multi-Task Learning in Deep Neural Networks for Mandarin-English Code-Mixing Speech Recognition. IEICE Trans. Inf. Syst. 99(10), 2554–2557 (2016)

  4. Yeh, C.F., Huang, C.Y., Sun, L.C., Lee, L.S.: An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling. In: 7th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 214–219 (2010)

  5. Yu, S., Zhang, S., Xu, B.: Chinese-English bilingual phone modeling for cross-language speech recognition. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), pp. I–917 (2004)

  6. Bhuvanagiri, K., Kopparapu, S.: An approach to mixed language automatic speech recognition. In: Oriental COCOSDA, Kathmandu, Nepal (2010)

  7. Lyu, D.-C., Lyu, R.-Y., Chiang, Y.-C., Hsu, C.-N.: Speech recognition on code-switching among the Chinese dialects. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006, p. I (2006)

  8. Chen, D., Mak, B.K.-W.: Multitask learning of deep neural networks for low-resource speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 23, 1172–1183 (2015)

  9. Huang, J.-T., Li, J., Yu, D., Deng, L., Gong, Y.: Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7304–7308 (2013)

  10. Giri, R., Seltzer, M.L., Droppo, J., Yu, D.: Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5014–5018 (2015)

  11. Davis, K., Biddulph, R., Balashek, S.: Automatic recognition of spoken digits. J. Acoust. Soc. Am. 24, 637–642 (1952)

  12. Chen, D., Mak, B., Leung, C.-C., Sivadas, S.: Joint acoustic modeling of triphones and trigraphemes by multi-task learning deep neural networks for low-resource speech recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5592–5596 (2014)

  13. Lyu, D.-C., Tan, T.-P., Chng, E.-S., Li, H.: Mandarin–English code-switching speech corpus in South-East Asia: SEAME. Lang. Resour. Eval. 49, 581–600 (2015)

Acknowledgements

This work is partially supported by Shenzhen Science & Research projects (No. JCYJ20160331104524983 and JSGG20160229121006579). The authors also gratefully acknowledge the reviewers' helpful comments and suggestions, which have improved the presentation.

Author information

Corresponding author

Correspondence to Yuexian Zou.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Song, X., Liu, Y., Yang, D., Zou, Y. (2018). A Multi-task Learning Approach for Mandarin-English Code-Switching Conversational Speech Recognition. In: Li, K., Li, W., Chen, Z., Liu, Y. (eds) Computational Intelligence and Intelligent Systems. ISICA 2017. Communications in Computer and Information Science, vol 873. Springer, Singapore. https://doi.org/10.1007/978-981-13-1648-7_9

  • DOI: https://doi.org/10.1007/978-981-13-1648-7_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1647-0

  • Online ISBN: 978-981-13-1648-7

  • eBook Packages: Computer Science, Computer Science (R0)
