Abstract
Attachment is a deep and enduring emotional bond that connects one person to another across time and space. Early attachment styles are established in infancy through interactions between infants and their caregivers. There are two attachment types: secure and insecure. The attachment experience shapes personality development, particularly the sense of security, and research shows it influences the ability to form stable relationships throughout life. It is also an important aspect of assessing parenting quality, so attachment has been widely studied in psychology. Attachment type is usually determined by Ainsworth's Strange Situation Assessment (SSA), which requires tedious manual observation. To the best of our knowledge, there is no computational method for predicting infant attachment type. We use video and audio from the Still-Face Paradigm (SFP) as input to predict attachment type with machine learning methods. In the present work, we recruited 64 infant-mother dyads, recorded SFP videos when the infants were 5–8 months old, and identified their attachment types (secure or insecure) by SSA when those infants were almost 2 years old. For the visual modality, we extract motion features and classify them with a recurrent neural network (RNN) with LSTM units. For the audio modality, speech enhancement is applied as pre-processing, and pitch frequency, short-time energy, and Mel-frequency cepstral coefficient (MFCC) feature sequences are extracted; an SVM is then deployed to explore the patterns in them. The experiments show that our method discriminates between the two classes of subjects with good accuracy.
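Two of the audio features named in the abstract, short-time energy and autocorrelation-based pitch, can be sketched in plain NumPy. This is a minimal illustration only: the 220 Hz test tone, the 25 ms frame length, and the 10 ms hop are assumptions for the example, not parameters from the paper, and a real pipeline would add windowing, voicing detection, and MFCC extraction via a DSP library.

```python
import numpy as np

def short_time_energy(signal, frame_len=400, hop=160):
    """Frame-wise energy: sum of squared samples in each analysis frame."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.array([np.sum(signal[i * hop:i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def pitch_autocorr(frame, sr, fmin=80.0, fmax=400.0):
    """Estimate F0 as the autocorrelation peak within a plausible lag range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)      # lag bounds from F0 bounds
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

# Synthetic 1-second "voiced" signal at 220 Hz, 16 kHz sampling rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 220.0 * t)

ste = short_time_energy(tone)      # one energy value per 25 ms frame
f0 = pitch_autocorr(tone[:400], sr)  # close to the true 220 Hz
```

Per-frame feature sequences like these (stacked with MFCCs) are the kind of input an SVM classifier would consume after summary statistics or sequence kernels are applied.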
Acknowledgment
This work was supported by the National Key R&D Program of China (2017YFB1002503).
© 2020 Springer Nature Switzerland AG
Cite this paper
Li, H., Cui, J., Wang, L., Zha, H. (2020). Infant Attachment Prediction Using Vision and Audio Features in Mother-Infant Interaction. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12047. Springer, Cham. https://doi.org/10.1007/978-3-030-41299-9_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41298-2
Online ISBN: 978-3-030-41299-9
eBook Packages: Computer Science (R0)