Comparative Analysis of Intelligent Personal Agent Performance

  • Conference paper
Knowledge Management and Acquisition for Intelligent Systems (PKAW 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11669)

Abstract

Intelligent Personal Assistant (IPA) devices such as Google Home and Amazon Echo have become commodity hardware and are well known to the public. Using these devices as speech-based interfaces to bespoke conversation agent (CA) systems in vocabulary-specific domains exposes their underlying Automatic Speech Recognition (ASR) transcription error rates, which are usually hidden behind a probability matching of utterance to intent (slot filling). We present an evaluation of the isolated-word and phrasal recognition rates of these two IPAs, together with an improvement scheme associated with a Contextual Multiple Classification Ripple Down Rules (C-MCRDR) CA knowledge-base system (KBS). In isolated-word tests with a human speaker, Google Home achieved an average word error rate (WER) of 0.082, compared to 0.276 for Amazon Echo. Computer-generated utterances unsurprisingly had much poorer recognition rates, with WERs of 0.155 for Google Home and 0.502 for Amazon Echo. In phrasal tests on human-sourced sentences, Google Home had an average WER of 0.066, compared to 0.242 for Amazon Echo. We applied a rule-based transcription error-correcting scheme to isolated words and achieved correct recognition rates of 100% for Google Home in five of the isolated-word datasets. Across all isolated-word datasets, the scheme improved the initial average WER of 0.082 to 0.0153, a significant relative decrease of 81.34%.
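To make the reported figures concrete, the sketch below computes WER under its conventional definition: word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The abstract does not describe the authors' scoring procedure, so this definition, the function names and the example utterances are assumptions for illustration only. The last helper reproduces the arithmetic behind the stated 81.34% decrease.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level Levenshtein distance / reference length.

    Assumption: case-insensitive, whitespace-tokenised comparison. The
    paper's actual normalisation is not given in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_decrease(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Hypothetical utterances, not taken from the paper's datasets.
print(word_error_rate("turn on the light", "turn of the night"))
# Figures from the abstract: average WER before and after correction.
print(round(relative_decrease(0.082, 0.0153), 2))
```

For a single isolated word this measure reduces to 0 or 1 per utterance, so an average isolated-word WER is simply the fraction of words transcribed incorrectly.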

Acknowledgments

This research was supported by a grant from the Asian Office of Aerospace Research and Development (AOARD). It was also supported by an Australian Government Research Training Program Scholarship and has University of Tasmania Ethics Approval, number H0016281.

Data cited herein has been extracted from the British National Corpus Online service, managed by Oxford University Computing Services on behalf of the BNC Consortium. All rights in the texts cited are reserved.

Author information

Correspondence to David Herbert.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Herbert, D., Kang, B. (2019). Comparative Analysis of Intelligent Personal Agent Performance. In: Ohara, K., Bai, Q. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2019. Lecture Notes in Computer Science, vol 11669. Springer, Cham. https://doi.org/10.1007/978-3-030-30639-7_11

  • DOI: https://doi.org/10.1007/978-3-030-30639-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30638-0

  • Online ISBN: 978-3-030-30639-7

  • eBook Packages: Computer Science, Computer Science (R0)
