DOI: 10.1145/2742647.2742658
research-article

AccelWord: Energy Efficient Hotword Detection through Accelerometer

Published: 18 May 2015

ABSTRACT

Voice control has emerged as a popular method for interacting with smart devices such as smartphones and smartwatches. Popular voice control applications like Siri and Google Now are already used by a large number of smartphone and tablet users. A major challenge in designing a voice control application is that it requires continuous monitoring of the user's voice input through the microphone. Such applications use hotwords such as "Okay Google" or "Hi Galaxy" to distinguish the user's voice commands from her other conversations. A voice control application has to listen continuously for hotwords, which significantly increases the energy consumption of the device.

To address this energy efficiency problem of voice control, we present AccelWord. AccelWord is based on empirical evidence that the accelerometer sensors found in today's mobile devices are sensitive to the user's voice. We also demonstrate that the effect of the user's voice on accelerometer data is rich enough to detect the hotwords spoken by the user. To achieve low energy cost with high detection accuracy, we address multiple challenges, e.g., how to extract unique signatures of the user's spoken hotwords from accelerometer data alone and how to reduce the interference caused by the user's mobility.

Finally, we implement AccelWord as a standalone application running on Android devices. Comprehensive tests show that AccelWord has a hotword detection accuracy of 85% in static scenarios and 80% in mobile scenarios. Compared to microphone-based hotword detection applications such as Google Now and Samsung S Voice, AccelWord is 2 times more energy efficient while achieving accuracies of 98% and 92% in static and mobile scenarios, respectively.


        • Published in

          MobiSys '15: Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services
          May 2015
          516 pages
          ISBN:9781450334945
          DOI:10.1145/2742647

          Copyright © 2015 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          MobiSys '15 Paper Acceptance Rate: 29 of 219 submissions, 13%
          Overall Acceptance Rate: 274 of 1,679 submissions, 16%
