
Low-Intensity Human Activity Recognition Framework Using Audio Data in an Outdoor Environment

  • Conference paper
Computer Vision and Image Processing (CVIP 2022)

Abstract

Audio-based activity recognition is an essential task in a wide range of human-centric applications. However, most existing work focuses predominantly on event detection, machine sound classification, road surveillance, scene classification, and similar tasks; recognition of low-intensity human activities in outdoor scenarios has received negligible attention. This paper proposes a deep learning-based framework for recognizing different low-intensity human activities in a sparsely populated outdoor environment using audio. The proposed framework classifies 2.0 s audio recordings into one of nine activity classes. The wide variety of sounds in an outdoor environment makes it challenging to distinguish human activities from other background sounds. The proposed framework is an end-to-end architecture that employs a combination of mel-frequency cepstral coefficients and a 2D convolutional neural network to obtain a deep representation of activities and classify them. Extensive experimental analysis demonstrates that the proposed framework outperforms existing frameworks by 16.43% in terms of F1-score. Additionally, we collect and provide an audio dataset to the research community for evaluation and benchmarking purposes.
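The abstract describes a pipeline that converts each 2.0 s clip into mel-frequency cepstral coefficients (MFCCs) and feeds the resulting time-frequency matrix to a 2D CNN. The paper's exact frame sizes, filter counts, and network architecture are not reproduced here; the following is a minimal NumPy-only sketch of the MFCC stage, with illustrative parameters (16 kHz sampling, 25 ms frames with 10 ms hop, 26 mel filters, 13 coefficients) that are assumptions, not the authors' settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160,
         n_fft=512, n_mels=26, n_ceps=13):
    # Slice the signal into overlapping Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (rfft zero-pads to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, floored to avoid log(0).
    mel_energy = np.maximum(power @ mel_filterbank(n_mels, n_fft, sr).T, 1e-10)
    log_mel = np.log(mel_energy)
    # DCT-II over the mel axis decorrelates energies into cepstral coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n[None, :] + 1)
                 / (2 * n_mels))
    return log_mel @ dct.T

# A 2.0 s synthetic clip at 16 kHz stands in for a real recording.
sig = np.sin(2 * np.pi * 440 * np.arange(32000) / 16000)
feats = mfcc(sig)
print(feats.shape)  # → (198, 13)
```

The resulting frames-by-coefficients matrix is the kind of 2D "image" a convolutional network would then consume for classification.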



Author information


Corresponding author

Correspondence to Priyankar Choudhary.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Choudhary, P., Kumari, P., Goel, N., Saini, M. (2023). Low-Intensity Human Activity Recognition Framework Using Audio Data in an Outdoor Environment. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1777. Springer, Cham. https://doi.org/10.1007/978-3-031-31417-9_49


  • DOI: https://doi.org/10.1007/978-3-031-31417-9_49


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31416-2

  • Online ISBN: 978-3-031-31417-9

  • eBook Packages: Computer Science, Computer Science (R0)
