Abstract
Crying is the infant’s first verbal communication. Before learning to express emotions or physiological and psychological needs through language, infants usually express how they feel by crying. Crying is a response to a stimulus such as hunger, pain, or discomfort. However, it is sometimes difficult to determine why an infant is crying, which can be frustrating or even frightening for a caretaker. This paper therefore proposes an infant cry classification system that categorizes types of infant crying to help parents and nursing staff attend to the needs of infants. Three distinct kinds of infant cry are currently identified: hunger, pain, and sleepiness. Fifteen features are extracted from each crying frame, and sequential forward floating selection (SFFS) is then used to pick out the highly discriminative features. Finally, a directed acyclic graph support vector machine (DAG-SVM) classifies the cries. Experimental results show that the proposed system performs well, with a classification accuracy of up to 92.17 %.
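The two algorithmic stages named in the abstract, sequential forward floating selection and the directed acyclic graph decision over pairwise classifiers, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the additive toy scoring function, the two-dimensional class centroids, and the nearest-centroid rule that stands in for each binary SVM are all assumptions made for illustration only.

```python
from itertools import combinations


def sffs(n_features, score, k):
    """Sequential forward floating selection of k feature indices.

    score(subset) is any callable returning a number (higher is better),
    for example cross-validated accuracy. Assumes k <= n_features.
    """
    selected = []
    best = {}  # best score seen so far for each subset size
    while len(selected) < k:
        # Forward step: add the single feature that helps the most.
        candidates = [f for f in range(n_features) if f not in selected]
        selected = selected + [max(candidates, key=lambda f: score(selected + [f]))]
        size = len(selected)
        best[size] = max(best.get(size, float("-inf")), score(selected))
        # Floating step: drop features while the reduced subset beats
        # the best subset of that smaller size found earlier.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if score(reduced) <= best[len(reduced)]:
                break
            selected = reduced
            best[len(reduced)] = score(reduced)
    return selected


def centroid_rule(a, b, centroids):
    """Hypothetical stand-in for a trained binary SVM separating classes a and b."""
    def decide(x):
        dist_a = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[a]))
        dist_b = sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[b]))
        return a if dist_a <= dist_b else b
    return decide


def dag_predict(classes, pairwise, x):
    """Walk the decision DAG: each pairwise test eliminates one candidate class."""
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if pairwise[(a, b)](x) == a:
            remaining.pop()    # b is ruled out
        else:
            remaining.pop(0)   # a is ruled out
    return remaining[0]


# Made-up illustration data; none of these values come from the paper.
CLASSES = ["hunger", "pain", "sleepy"]
CENTROIDS = {"hunger": (0.0, 0.0), "pain": (4.0, 0.0), "sleepy": (0.0, 4.0)}
PAIRWISE = {(a, b): centroid_rule(a, b, CENTROIDS) for a, b in combinations(CLASSES, 2)}
WEIGHTS = [0.1, 0.9, 0.5, 0.3]  # toy per-feature usefulness


def toy_score(subset):
    """Additive toy criterion; a real system would score by classifier accuracy."""
    return sum(WEIGHTS[f] for f in subset)


chosen_features = sffs(len(WEIGHTS), toy_score, 2)
predicted = dag_predict(CLASSES, PAIRWISE, (3.5, 0.5))
```

With three classes the DAG needs only two pairwise evaluations per sample, which is the usual motivation for this arrangement over voting across all pairwise classifiers. In practice each `centroid_rule` would be replaced by a binary SVM trained on the features that the selection step retains.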
Acknowledgments
This work was supported by the Ministry of Science and Technology, Taiwan, under Grant NSC 100-2218-E-224-007-MY3.
Cite this article
Chang, CY., Chang, CW., Kathiravan, S. et al. DAG-SVM based infant cry classification system using sequential forward floating feature selection. Multidim Syst Sign Process 28, 961–976 (2017). https://doi.org/10.1007/s11045-016-0404-5
DOI: https://doi.org/10.1007/s11045-016-0404-5