Abstract
To enable intelligent monitoring of infant and toddler behaviour, reduce the risk of accidental injury, and ease the burden on caregivers, this paper proposes a behavioural state detection algorithm that incorporates multi-scale contextual features. The algorithm monitors in real time which of six states an infant or toddler is in: climbing, crawling, sitting, lying, standing (walking), or lost. To detect targets of interest at multiple scales while keeping detection fast, a deep feature fusion network is constructed based on the feature pyramid network structure. In addition, to help the deep feature fusion network capture more global semantic information from the feature map, a contextual feature extraction structure is constructed that mines valid contextual features of the feature map using a residual structure and dilated convolution. Experimental results show that the method achieves a detection speed of 72.18 FPS and a detection accuracy of 95.24%, detecting infant and toddler behavioural states faster than the baseline algorithm with slightly better accuracy.
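The abstract names only the ingredients of the contextual feature extraction structure (a residual structure and dilated convolution), not its exact layout. The sketch below is a minimal pure-Python illustration of how those two ingredients can combine. The function names, the single-channel feature map, the 3x3 kernel, the dilation rates (1, 2, 3), and the choice to sum the branches are all assumptions made for illustration, not the authors' design.

```python
def dilated_conv2d(x, kernel, dilation):
    """3x3 dilated convolution on a 2D list, zero-padded so the
    output has the same height and width as the input."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for ki in range(3):
                for kj in range(3):
                    # Kernel taps are spaced `dilation` pixels apart,
                    # which widens the receptive field at no extra cost.
                    ii = i + (ki - 1) * dilation
                    jj = j + (kj - 1) * dilation
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[ki][kj] * x[ii][jj]
            out[i][j] = acc
    return out


def residual_context_block(x, kernel, dilations=(1, 2, 3)):
    """Hypothetical context block: sum several dilated-conv branches
    and add the input back through an identity (residual) shortcut."""
    h, w = len(x), len(x[0])
    out = [[float(v) for v in row] for row in x]  # identity shortcut
    for d in dilations:
        branch = dilated_conv2d(x, kernel, d)
        for i in range(h):
            for j in range(w):
                out[i][j] += branch[i][j]
    return out


# Usage: a single impulse at the centre of a 7x7 map shows how each
# dilation rate spreads information to a different distance.
feature_map = [[0.0] * 7 for _ in range(7)]
feature_map[3][3] = 1.0
ones_kernel = [[1.0] * 3 for _ in range(3)]
context = residual_context_block(feature_map, ones_kernel)
```

With an all-ones kernel, the impulse reaches a 3x3 neighbourhood at dilation 1, a 5x5 footprint at dilation 2, and a 7x7 footprint at dilation 3, while the shortcut preserves the original value. This is the sense in which dilated branches gather wider context without downsampling the feature map.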
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Q., Zhu, Z., Guo, W., Huang, H. (2023). Behavioural State Detection Algorithm for Infants and Toddlers Incorporating Multi-scale Contextual Features. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14356. Springer, Cham. https://doi.org/10.1007/978-3-031-46308-2_11
DOI: https://doi.org/10.1007/978-3-031-46308-2_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46307-5
Online ISBN: 978-3-031-46308-2
eBook Packages: Computer Science (R0)