Abstract
Crying is a baby’s primary means of communicating with the outside world. When a baby cries, novice parents often find it difficult to understand the baby’s needs immediately. If parents can accurately determine the cause of a cry, they can better understand the baby’s emotional and physiological state and needs. In real-world applications, however, recording devices may capture sounds that are not produced by a baby. To reduce the burden on the recognition server and improve classifier accuracy, this study proposes converting the recorded audio signal into a two-dimensional spectrogram. A convolutional neural network first determines whether the input spectrogram represents a baby’s cry. A detected cry is then classified into one of four categories (pain, hunger, sleepiness, and wet diaper) by additional one-dimensional convolutional neural networks. Experimental results show that the proposed method achieves high cry detection and recognition rates.
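The signal-to-spectrogram step mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name log_spectrogram, the frame length (256 samples), the hop size (128 samples), the Hann window, the log compression, and the 8 kHz sampling rate are all assumptions chosen for illustration, since the abstract does not state these parameters.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Zero-pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT of each frame, then log compression.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)

# Synthetic stand-in for a recording: a 1-second, 500 Hz tone at 8 kHz (assumed rate).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 500 * t)
spec = log_spectrogram(tone)
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input a convolutional cry/non-cry detector could consume; how the paper's networks are actually configured is not described in the abstract.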
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chang, CY., Tsai, LY. (2019). A CNN-Based Method for Infant Cry Detection and Recognition. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T. (eds) Web, Artificial Intelligence and Network Applications. WAINA 2019. Advances in Intelligent Systems and Computing, vol 927. Springer, Cham. https://doi.org/10.1007/978-3-030-15035-8_76
DOI: https://doi.org/10.1007/978-3-030-15035-8_76
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15034-1
Online ISBN: 978-3-030-15035-8
eBook Packages: Intelligent Technologies and Robotics (R0)