Multi-view Infant Cry Classification

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2023)

Abstract

This paper addresses infant cry classification in multi-view settings, that is, settings where the typical low-level representations, commonly used for audio recognition tasks, are considered as different views of the target data. We show that through the use of multi-view methods, such as Structured Latent Multi-View Representation Learning, we are able to reliably discriminate between normal and pathological infant cry signals. Extensive experimental results on two benchmark infant cry data sets indicate that the proposed method outperforms state-of-the-art models.
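To make the notion of "views" concrete, the following is a minimal sketch, not the authors' pipeline, of how several low-level representations of the same audio signal can be computed frame by frame and kept as separate views. The three features (short-time energy, zero-crossing rate, and a few DFT magnitudes), the frame sizes, the function names, and the synthetic tone standing in for a cry recording are all illustrative assumptions. The sketch stops at view construction and does not implement Structured Latent Multi-View Representation Learning.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def dft_magnitudes(frame, n_bins):
    """Magnitudes of the first n_bins DFT coefficients (naive computation)."""
    n = len(frame)
    mags = []
    for k in range(n_bins):
        re = sum(x * math.cos(2 * math.pi * k * t / n)
                 for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags


def build_views(signal, frame_len=64, hop=32, n_bins=8):
    """Return one feature matrix (frames x dims) per view.

    The choice of views here is an assumption made for illustration only.
    """
    frames = frame_signal(signal, frame_len, hop)
    return {
        "energy": [[short_time_energy(f)] for f in frames],
        "zcr": [[zero_crossing_rate(f)] for f in frames],
        "spectrum": [dft_magnitudes(f, n_bins) for f in frames],
    }


# Synthetic stand-in for a recording: a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
views = build_views(signal)

for name, matrix in views.items():
    print(name, len(matrix), "frames x", len(matrix[0]), "dims")
```

Each entry in `views` describes the same frames from a different angle. A multi-view method would consume these matrices jointly rather than concatenating them into a single feature vector.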

Thanks to VLIRUOS for financial support in the framework of the Institutional University Cooperation project with Universidad de Oriente, Cuba.

Notes

  1. https://github.com/gveres/donateacry-corpus
  2. https://github.com/SuperKogito/spafe
  3. https://librosa.org/
  4. https://github.com/kitzeslab/opensoundscape
  5. https://pypi.org/project/disvoice-prosody/

Author information

Correspondence to Yadisbel Martinez-Cañete, Hichem Sahli or Abel Díaz Berenguer.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Martinez-Cañete, Y., Sahli, H., Berenguer, A.D. (2023). Multi-view Infant Cry Classification. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_51

  • DOI: https://doi.org/10.1007/978-3-031-36616-1_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36615-4

  • Online ISBN: 978-3-031-36616-1

  • eBook Packages: Computer Science, Computer Science (R0)
