DOI: 10.1145/3522783.3529528
research-article
Public Access

Voice Fingerprinting for Indoor Localization with a Single Microphone Array and Deep Learning

Published: 16 May 2022

ABSTRACT

With the rapid development of the Internet of Things (IoT), smart speakers for voice assistance have become increasingly important in smart homes, offering a new type of human-machine interface. Voice localization with microphone arrays can improve smart speaker performance and enable many new IoT applications. To address the challenges of complex indoor environments, such as non-line-of-sight (NLOS) and multi-path propagation, we propose voice fingerprinting for indoor localization using a single microphone array. The proposed system consists of a ReSpeaker 6-mic circular array kit connected to a Raspberry Pi and a deep learning model, and operates in two stages: offline training and online testing. In the offline stage, the models are trained with spectrogram images obtained from audio data using the short-time Fourier transform (STFT). Transfer learning is used to speed up the training process. In the online stage, a top-K probabilistic method is used for location estimation. Our experimental results demonstrate that the Inception-ResNet-v2 model achieves satisfactory localization performance with small location errors in two typical home environments.
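The abstract does not spell out how the top-K probabilistic method works. One common reading in fingerprinting-based localization, assumed here and not confirmed by the source, is a weighted centroid: take the K training locations to which the classifier assigns the highest probability and average their coordinates, weighted by those probabilities. The sketch below illustrates that assumption in plain Python; the function name `top_k_estimate`, the sample probabilities, and the coordinates are all hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def top_k_estimate(probs: Sequence[float],
                   locations: Sequence[Point],
                   k: int = 3) -> Point:
    """Weighted centroid of the k most probable training locations.

    Assumption (not from the paper): probs[i] is the model's output
    probability for training location locations[i].
    """
    if len(probs) != len(locations):
        raise ValueError("probs and locations must have the same length")
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and the number of locations")

    # Indices of the k largest probabilities, most probable first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]

    # Renormalize over the selected locations so the weights sum to 1.
    total = sum(probs[i] for i in top)
    if total <= 0:
        raise ValueError("selected probabilities must sum to a positive value")

    x = sum(probs[i] * locations[i][0] for i in top) / total
    y = sum(probs[i] * locations[i][1] for i in top) / total
    return (x, y)


# Hypothetical classifier output over four training locations (metres).
probs = [0.1, 0.6, 0.3, 0.0]
locations = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
estimate = top_k_estimate(probs, locations, k=2)
```

With K = 1 this reduces to picking the single most likely training location. Larger K lets the estimate fall between training locations, which is the usual motivation for a probabilistic rather than a hard-decision estimator.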


            Published in

            WiseML '22: Proceedings of the 2022 ACM Workshop on Wireless Security and Machine Learning
            May 2022
            93 pages
            ISBN: 9781450392778
            DOI: 10.1145/3522783
            General Chair: Murtuza Jadliwala

            Copyright © 2022 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States
