
Data Augmentation for Internet of Things Dialog System

Abstract

With the rapid development of voice control technology, making speech recognition more precise across the many IoT domains has become a challenging problem. Because conversations occur in a wide variety of scenes, understanding the context of a dialog scene is a key issue for voice control systems; in practice, however, the training data available for dialog systems are rarely sufficient. In this paper, we address this data scarcity with a data augmentation technique. We propose a Generative Adversarial Network (GAN)-based model that augments the data effectively: it generates text from text, enriches the original data through text retelling, and improves the robustness of parameter estimation on unseen data by exploiting the samples generated by the GAN. A new N-gram language model is then used to evaluate the multiple candidates produced by the speech recognizer, and the candidate sentence with the highest score is selected as the final recognition result. Experiments verify the proposed generative-model-based data augmentation algorithm: in the model comparison test, the error rates on the THCHS30 and AISHELL data sets are 3.3% and 5.1%, respectively, both lower than those of the baseline system.
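
To make the candidate-rescoring idea concrete, the following is a minimal sketch in Python, not the authors' implementation: a bigram language model with add-one smoothing (standing in for the paper's N-gram model) scores several recognition candidates and keeps the highest-scoring one. The toy corpus, hypotheses, and function names are illustrative assumptions.

    # Minimal sketch (not the paper's system): rescore ASR candidates with a
    # bigram language model trained on in-domain text; the toy corpus stands
    # in for GAN-augmented dialog data.
    import math
    from collections import Counter

    def train_bigram_lm(sentences):
        """Count unigrams and bigrams over tokenized, padded sentences."""
        unigrams, bigrams = Counter(), Counter()
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            unigrams.update(padded)
            bigrams.update(zip(padded, padded[1:]))
        return unigrams, bigrams

    def log_prob(tokens, unigrams, bigrams):
        """Add-one smoothed log-probability of one candidate sentence."""
        vocab = len(unigrams)
        padded = ["<s>"] + tokens + ["</s>"]
        score = 0.0
        for prev, cur in zip(padded, padded[1:]):
            score += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
        return score

    def pick_best(candidates, unigrams, bigrams):
        """Return the recognition candidate the language model prefers."""
        return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))

    if __name__ == "__main__":
        corpus = [["turn", "on", "the", "light"],   # toy in-domain sentences
                  ["turn", "off", "the", "fan"],
                  ["set", "the", "light", "to", "warm"]]
        uni, bi = train_bigram_lm(corpus)
        hyps = [["turn", "on", "the", "light"],     # two hypothetical
                ["turn", "on", "the", "lie"]]       # ASR hypotheses
        print(pick_best(hyps, uni, bi))             # -> the in-domain hypothesis

In the paper's pipeline the language model would presumably be estimated on the GAN-augmented corpus rather than a toy list, but the selection step is the same: an argmax over the language-model scores of the recognizer's candidate sentences.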

Acknowledgements

This work was partially supported by national funding from the FCT - Fundação para a Ciência e a Tecnologia through the UID/EEA/50008/2019 project, and by the Brazilian National Council for Scientific and Technological Development (CNPq) via Grant No. 309335/2017-5.

Author information

Corresponding author

Correspondence to Saru Kumari.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, E.K., Yu, J., Chen, C.M. et al. Data Augmentation for Internet of Things Dialog System. Mobile Netw Appl 27, 158–171 (2022). https://doi.org/10.1007/s11036-020-01638-9
