Abstract
Audio-based activity recognition is an essential task in a wide range of human-centric applications. However, prior work focuses predominantly on event detection, machine-sound classification, road surveillance, and acoustic scene classification; the recognition of low-intensity human activities in outdoor scenarios has received negligible attention. This paper proposes a deep learning-based framework for recognizing low-intensity human activities in a sparsely populated outdoor environment using audio. The framework classifies 2.0 s audio recordings into one of nine activity classes. The variety of sounds in an outdoor environment makes it challenging to distinguish human activities from other background sounds. The proposed framework is an end-to-end architecture that combines mel-frequency cepstral coefficients (MFCCs) with a 2D convolutional neural network to obtain a deep representation of activities and classify them. Extensive experimental analysis demonstrates that the proposed framework outperforms existing frameworks by 16.43% in F1-score. Additionally, we collected an audio dataset and provide it to the research community for evaluation and benchmarking purposes.
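To make the front-end concrete, the sketch below shows how a 2.0 s mono clip can be turned into an MFCC "image" (coefficients × frames) of the kind a 2D CNN consumes. This is a minimal numpy-only illustration of the standard MFCC computation, not the authors' implementation; all parameter values (16 kHz sample rate, 512-sample Hann window, 256-sample hop, 40 mel bands, 20 coefficients) are assumptions, since the abstract does not specify the front-end settings.

```python
import numpy as np

def mfcc_image(signal, sr=16000, n_fft=512, hop=256, n_mels=40, n_mfcc=20):
    """Convert a mono clip into an MFCC 'image' (n_mfcc x frames) for a 2D CNN.
    Parameter values are illustrative assumptions, not the paper's settings."""
    # 1. Frame the signal with a Hann window
    window = np.hanning(n_fft)
    frames = np.array([signal[s:s + n_fft] * window
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # 2. Power spectrogram, shape (T, n_fft // 2 + 1)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # 3. Triangular mel filterbank
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising slope
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling slope
    mel_energy = np.log(power @ fb.T + 1e-10)  # (T, n_mels), log mel energies

    # 4. DCT-II to decorrelate mel energies into cepstral coefficients
    t = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (t + 0.5)[None, :] * np.arange(n_mfcc)[:, None])
    return dct @ mel_energy.T  # (n_mfcc, T)

# A 2.0 s clip at the assumed 16 kHz rate yields one MFCC image per recording
clip = np.random.randn(2 * 16000)
img = mfcc_image(clip)
print(img.shape)  # (20, 124)
```

With these assumed settings, each recording becomes a fixed-size 20 × 124 matrix, which is why a 2D CNN (treating coefficients and time as the two image axes) is a natural choice of classifier.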
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Choudhary, P., Kumari, P., Goel, N., Saini, M. (2023). Low-Intensity Human Activity Recognition Framework Using Audio Data in an Outdoor Environment. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1777. Springer, Cham. https://doi.org/10.1007/978-3-031-31417-9_49
Print ISBN: 978-3-031-31416-2
Online ISBN: 978-3-031-31417-9