Abstract
Children’s social and physical environment plays a significant role in their cognitive development; their lived experiences are therefore of central interest to developmental psychologists. The traditional approach to studying everyday experience has become a bottleneck because it relies on short recordings and manual coding. Designing a non-invasive, child-friendly recording setup and automating the coding process can improve research standards by allowing researchers to study longer and more diverse aspects of experience. We leverage modern computer vision tools and techniques to address this problem: we present a simple, non-invasive video recording setup and collect egocentric data from children. Testing state-of-the-art object detectors on these recordings, we observe that children’s egocentric video poses a challenging problem, as indicated by the detectors’ low mean Average Precision (mAP); their performance can be improved through fine-tuning. Once accurate object detection has been achieved, further questions, such as human-object interaction and scene understanding, can be addressed. An automatic processing pipeline may provide developmental psychologists with an important tool for studying variation in everyday experience.
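The paper does not publish its evaluation code, so the following is only a minimal sketch of the workflow the abstract describes: running an off-the-shelf pretrained detector over egocentric video frames and scoring it with mean Average Precision. The choice of torchvision's Faster R-CNN, the torchmetrics mAP implementation, and the frame paths, class ids, and ground-truth boxes are all illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch (not the authors' pipeline): evaluate a COCO-pretrained
# detector on egocentric frames and report COCO-style mAP.
# Assumes torchvision >= 0.13 and torchmetrics with detection extras.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn,
    FasterRCNN_ResNet50_FPN_Weights,
)
from torchmetrics.detection import MeanAveragePrecision

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT  # COCO-pretrained
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

# Hypothetical frames extracted from a child's egocentric recording,
# with hand-coded ground-truth boxes (xyxy pixels) and class labels.
frames = ["frame_0001.jpg", "frame_0002.jpg"]
targets = [
    {"boxes": torch.tensor([[50.0, 40.0, 200.0, 180.0]]),
     "labels": torch.tensor([47])},  # placeholder class id
    {"boxes": torch.tensor([[10.0, 20.0, 120.0, 150.0]]),
     "labels": torch.tensor([84])},  # placeholder class id
]

metric = MeanAveragePrecision()  # averages AP over IoU 0.50:0.95
with torch.no_grad():
    for path, target in zip(frames, targets):
        image = preprocess(read_image(path))  # uint8 CHW -> float
        prediction = model([image])[0]        # dict: boxes/scores/labels
        metric.update([prediction], [target])

results = metric.compute()
print(f"mAP: {results['map']:.3f}  mAP@50: {results['map_50']:.3f}")
```

Fine-tuning to the study's object categories would then follow the standard torchvision recipe of swapping the box-predictor head for one sized to the new label set and training on the annotated frames.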