Abstract
This paper addresses infant cry classification in a multi-view setting, in which the low-level representations commonly used for audio recognition tasks are treated as different views of the target data. We show that multi-view methods, such as Structured Latent Multi-View Representation Learning, can reliably discriminate between normal and pathological infant cry signals. Extensive experiments on two benchmark infant cry data sets indicate that the proposed method outperforms state-of-the-art models.
Thanks to VLIRUOS for financial support in the framework of the Institutional University Cooperation project with Universidad de Oriente, Cuba.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Martinez-Cañete, Y., Sahli, H., Berenguer, A.D. (2023). Multi-view Infant Cry Classification. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_51
DOI: https://doi.org/10.1007/978-3-031-36616-1_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36615-4
Online ISBN: 978-3-031-36616-1
eBook Packages: Computer Science, Computer Science (R0)