ABSTRACT
Voice control has emerged as a popular method for interacting with smart devices such as smartphones and smartwatches. Popular voice control applications like Siri and Google Now are already used by a large number of smartphone and tablet users. A major challenge in designing a voice control application is that it requires continuous monitoring of the user's voice input through the microphone. Such applications use hotwords such as "Okay Google" or "Hi Galaxy" to distinguish the user's voice commands from other conversations. Because a voice control application has to listen continuously for hotwords, it significantly increases the energy consumption of the device.
To address this energy efficiency problem of voice control, we present AccelWord. AccelWord is based on the empirical evidence that the accelerometer sensors found in today's mobile devices are sensitive to the user's voice. We also demonstrate that the effect of the user's voice on accelerometer data is rich enough to detect the hotwords spoken by the user. To achieve low energy cost together with high detection accuracy, we address multiple challenges, e.g., how to extract unique signatures of the user's spoken hotwords from accelerometer data alone and how to reduce the interference caused by the user's mobility.
Finally, we implement AccelWord as a standalone application running on Android devices. Comprehensive tests show that AccelWord has a hotword detection accuracy of 85% in static scenarios and 80% in mobile scenarios. Compared to microphone-based hotword detection applications such as Google Now and Samsung S Voice, AccelWord is 2 times more energy efficient while achieving the accuracy of 98% and 92% in static and mobile scenarios respectively.
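As a rough illustration of the two challenges named above (extracting a signature from accelerometer data alone and reducing interference from the user's mobility), the following Python sketch filters one accelerometer axis and computes simple per-window features. It is not the AccelWord implementation: the first-order high-pass filter, the smoothing factor `alpha`, the window size, and the three features (mean, standard deviation, mean-crossing rate) are all assumptions chosen for illustration, and the abstract does not say which filters or features the system uses.

```python
import math


def high_pass(samples, alpha=0.9):
    """First-order high-pass filter (illustrative choice, not from the paper).

    Attenuates slowly varying components such as gravity and body motion,
    leaving the faster variations in the signal.
    """
    if not samples:
        return []
    out = [0.0]
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def window_features(window):
    """Return (mean, standard deviation, mean-crossing rate) for one window.

    The window must contain at least two samples.
    """
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    # Count sign changes of the mean-centered signal between neighbours.
    crossings = sum(
        1 for a, b in zip(window, window[1:]) if (a - mean) * (b - mean) < 0
    )
    return mean, std, crossings / (n - 1)


def extract_features(samples, window_size=50, alpha=0.9):
    """Filter the signal, then compute features over non-overlapping windows."""
    filtered = high_pass(samples, alpha)
    return [
        window_features(filtered[i:i + window_size])
        for i in range(0, len(filtered) - window_size + 1, window_size)
    ]


if __name__ == "__main__":
    # Synthetic input: a slow drift (stand-in for mobility) plus a fast,
    # small vibration (stand-in for the effect of voice on the sensor).
    signal = [0.01 * t + 0.05 * math.sin(2.0 * t) for t in range(200)]
    for features in extract_features(signal):
        print(features)
```

In a real detector, feature vectors like these would be passed to a trained classifier that decides whether a window contains the hotword; that stage is omitted here.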
Index Terms
- AccelWord: Energy Efficient Hotword Detection through Accelerometer