Comparative Analysis of Intelligent Personal Agent Performance

Herbert, David; Kang, Byeong

doi:10.1007/978-3-030-30639-7_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11669)

Included in the following conference series:

Pacific Rim Knowledge Acquisition Workshop

Abstract

Intelligent Personal Assistant (IPA) devices such as Google Home and Amazon Echo have become commodity hardware and are well known to the public. Leveraging these devices as speech-based interfaces to bespoke conversation agent (CA) systems in vocabulary-specific domains exposes their underlying Automatic Speech Recognition (ASR) transcription error rates, which are usually hidden behind probabilistic matching of utterances to intents (slot filling). We present an evaluation of these two IPAs' isolated-word and phrasal recognition rates, together with an improvement scheme associated with a Contextual Multiple Classification Ripple Down Rules (C-MCRDR) CA knowledge-base system (KBS). When measuring isolated-word word error rates (WER) for a human speaker, Google Home achieved an average WER of 0.082 compared to 0.276 for Amazon Echo. Computer-generated utterances unsurprisingly had much poorer recognition rates, with WERs of 0.155 for Google Home and 0.502 for Amazon Echo. For phrasal tests on human-sourced sentences, Google Home had an average WER of 0.066 compared to 0.242 for Amazon Echo. We applied a rule-based transcription error-correcting scheme for isolated words and achieved correct recognition rates of 100% for the Google Home in five of the isolated-word data sets. Across all isolated-word data sets, the scheme improved the initial average WER from 0.082 to 0.0153, a relative decrease of 81.34%.
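
As an illustration of the metric quoted above, the following minimal sketch computes WER in its conventional form (word-level edit distance divided by the number of reference words) and checks the stated relative decrease. The function names `word_error_rate` and `relative_reduction` are hypothetical and chosen here for illustration; the abstract does not describe the authors' own tooling, so this should not be read as their implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


# Made-up utterances (not from the paper's data sets): one substituted word in four.
print(word_error_rate("turn on the light", "turn off the light"))

# Relative decrease between the two average WERs reported in the abstract.
print(round(relative_reduction(0.082, 0.0153), 2))
```

For isolated-word tests, each reference is a single word, so under this definition the average WER over a data set reduces to the fraction of words that were transcribed incorrectly.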

Acknowledgments

This research was supported by a grant from the Asian Office of Aerospace Research and Development (AOARD). It was also supported by an Australian Government Research Training Program Scholarship and has University of Tasmania Ethics Approval, number H0016281.

Data cited herein has been extracted from the British National Corpus Online service, managed by Oxford University Computing Services on behalf of the BNC Consortium. All rights in the texts cited are reserved.

Author information

Authors and Affiliations

University of Tasmania, Hobart, Australia
David Herbert & Byeong Kang

Authors

David Herbert
Byeong Kang

Corresponding author

Correspondence to David Herbert.

Editor information

Editors and Affiliations

Aoyama Gakuin University, Tokyo, Japan
Kouzou Ohara
University of Tasmania, Tasmania, Australia
Quan Bai

About this paper

Cite this paper

Herbert, D., Kang, B. (2019). Comparative Analysis of Intelligent Personal Agent Performance. In: Ohara, K., Bai, Q. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2019. Lecture Notes in Computer Science, vol 11669. Springer, Cham. https://doi.org/10.1007/978-3-030-30639-7_11

DOI: https://doi.org/10.1007/978-3-030-30639-7_11
Published: 22 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30638-0
Online ISBN: 978-3-030-30639-7
eBook Packages: Computer Science, Computer Science (R0)