Enabling progressive system integration for AIoT and speech-based HCI through semantic-aware computing

Chang, Jia-Wei

doi:10.1007/s11227-021-03996-x

Published: 22 July 2021

Volume 78, pages 3288–3324, (2022)

The Journal of Supercomputing

Jia-Wei Chang (ORCID: orcid.org/0000-0002-9321-6278)

Abstract

We present a novel integration architecture for speech-based human–computer interaction, built on a progressive growth framework and semantic-aware computing. The architecture integrates different services and accommodates the diversity of Internet of Things (IoT) platforms. A natural language understanding (NLU) agent is proposed as the controller of IoT hubs and hybrid cloud services. With semantic-aware computing, the NLU agent can effectively perform context-sensitive topic correlation and user intent analysis. Through its modular design, the progressive growth framework lets the NLU agent converse on many different topics, such as current affairs and music, and load local and cloud services, such as IoT platforms and hybrid cloud services, according to user demand. Three daily-life applications are presented as case studies to demonstrate the architecture's potential and value. With the proposed integration architecture, users can develop valuable applications tailored to their needs in various industries.
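As a rough illustration of the modular idea described in the abstract, the following minimal sketch shows an agent that loads service modules on demand and routes an utterance to the best-matching one. It is not the paper's implementation: the names `SkillModule`, `Agent`, `load`, `unload`, `route`, and `respond` are hypothetical, and simple keyword overlap stands in for the semantic-aware intent analysis, whose details are not given in this preview.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class SkillModule:
    """Hypothetical pluggable service (local or cloud) with trigger keywords."""
    name: str
    keywords: Set[str]
    handler: Callable[[str], str]


@dataclass
class Agent:
    """Toy controller that loads modules on demand and routes utterances."""
    modules: Dict[str, SkillModule] = field(default_factory=dict)

    def load(self, module: SkillModule) -> None:
        # Register a module so it can be selected for later utterances.
        self.modules[module.name] = module

    def unload(self, name: str) -> None:
        # Remove a module; unknown names are ignored.
        self.modules.pop(name, None)

    def route(self, utterance: str) -> Optional[SkillModule]:
        # Assumption: keyword overlap is a stand-in for real intent analysis.
        tokens = set(utterance.lower().split())
        best, best_score = None, 0
        for module in self.modules.values():
            score = len(tokens & module.keywords)
            if score > best_score:
                best, best_score = module, score
        return best

    def respond(self, utterance: str) -> str:
        module = self.route(utterance)
        if module is None:
            return "Sorry, no loaded service can handle that."
        return module.handler(utterance)


agent = Agent()
agent.load(SkillModule("music", {"play", "song", "music"},
                       lambda u: "Playing music."))
agent.load(SkillModule("lights", {"light", "lights", "lamp"},
                       lambda u: "Toggling the lights."))

print(agent.respond("please play a song"))
print(agent.respond("turn on the lights"))
```

Because modules are only registry entries, new topics or IoT services can be added or removed at runtime without changing the routing code, which is the property the abstract attributes to the modular design.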

Acknowledgements

This work was supported by the Ministry of Science and Technology, Taiwan, R.O.C. [grant number MOST 108-2218-E-025-002-MY3]. Special thanks to Mr. Yu-Ting Hsiao for his assistance with the programming for this study, and to Miss Ching-Yi Chiou for her assistance in proofreading this paper.

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taichung University of Science and Technology, Taichung, Taiwan
Jia-Wei Chang

Corresponding author

Correspondence to Jia-Wei Chang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chang, JW. Enabling progressive system integration for AIoT and speech-based HCI through semantic-aware computing. J Supercomput 78, 3288–3324 (2022). https://doi.org/10.1007/s11227-021-03996-x

Accepted: 12 July 2021
Published: 22 July 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11227-021-03996-x
