
A Coarse-to-Fine Human Visual Focus Estimation for ASD Toddlers in Early Screening

  • Conference paper
  • First Online:
Intelligent Robotics and Applications (ICIRA 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13455)


Abstract

Human visual focus is a vital cue for uncovering a subject's underlying cognitive processes. To predict the subject's visual focus, existing deep learning methods learn to combine head orientation, head location, and scene content to estimate the visual focal point. However, these methods face three main problems: the visual focal point prediction depends solely on learned spatial distribution heatmaps, the reasoning in post-processing is non-learnable, and the learning of gaze salience representation could exploit more prior knowledge. We therefore propose a coarse-to-fine human visual focus estimation method that addresses these problems and improves estimation performance. First, we introduce a coarse-to-fine regression module, in which the coarse branch estimates the subject's likely attention area while the fine branch directly outputs the estimated visual focal point position, thus avoiding sequential reasoning and making visual focal point estimation fully learnable. Furthermore, a human visual field prior is used to guide the learning of gaze salience, for better encoding target-related representations. Extensive experimental results demonstrate that our method outperforms existing state-of-the-art methods on a self-collected ASD-attention dataset.
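The two-branch design described in the abstract can be sketched in code. The following is a minimal, hypothetical PyTorch illustration, not the authors' implementation: the cone-shaped visual-field prior, the layer sizes, and all function and class names (`visual_field_prior`, `CoarseToFineHead`) are assumptions made for this sketch. It shows how a coarse branch can emit an attention-area heatmap while a fine branch directly regresses the focal point from the same prior-modulated features, so both outputs are produced by learnable layers rather than heatmap post-processing.

```python
import math
import torch
import torch.nn as nn


def visual_field_prior(h, w, head_xy, gaze_dir, fov_deg=60.0):
    """Cone-shaped mask approximating a human visual field prior.

    head_xy: normalized (x, y) head position in [0, 1]^2.
    gaze_dir: unit 2-D gaze direction (e.g., derived from head orientation).
    The cone aperture fov_deg is an illustrative assumption.
    """
    ys, xs = torch.meshgrid(
        torch.linspace(0, 1, h), torch.linspace(0, 1, w), indexing="ij"
    )
    vx, vy = xs - head_xy[0], ys - head_xy[1]
    norm = torch.sqrt(vx ** 2 + vy ** 2).clamp(min=1e-6)
    # Cosine between the gaze direction and the head-to-pixel vector.
    cos_sim = (vx * gaze_dir[0] + vy * gaze_dir[1]) / norm
    return (cos_sim > math.cos(math.radians(fov_deg / 2))).float()


class CoarseToFineHead(nn.Module):
    """Coarse branch: attention-area heatmap. Fine branch: (x, y) focal point."""

    def __init__(self, in_ch=32):
        super().__init__()
        # Coarse branch: 1x1 conv producing heatmap logits over the scene.
        self.coarse = nn.Conv2d(in_ch, 1, kernel_size=1)
        # Fine branch: pooled features regressed directly to a point in [0, 1]^2.
        self.fine = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, 128), nn.ReLU(),
            nn.Linear(128, 2), nn.Sigmoid(),
        )

    def forward(self, feats, prior):
        # Modulate scene features by the visual-field prior before both branches.
        feats = feats * prior.unsqueeze(0).unsqueeze(0)
        heatmap = self.coarse(feats)   # coarse attention area (logits)
        point = self.fine(feats)       # fine focal point, normalized coordinates
        return heatmap, point
```

Under this sketch, the coarse heatmap would be supervised with an attention-area target while the fine branch is trained with a point-regression loss, so the whole pipeline stays end-to-end learnable.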




Author information

Corresponding author: Honghai Liu.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, X., et al. (2022). A Coarse-to-Fine Human Visual Focus Estimation for ASD Toddlers in Early Screening. In: Liu, H., et al. Intelligent Robotics and Applications. ICIRA 2022. Lecture Notes in Computer Science, vol. 13455. Springer, Cham. https://doi.org/10.1007/978-3-031-13844-7_43


  • DOI: https://doi.org/10.1007/978-3-031-13844-7_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-13843-0

  • Online ISBN: 978-3-031-13844-7

  • eBook Packages: Computer Science; Computer Science (R0)
