Semi-automatic Pipeline for Large-Scale Dataset Annotation Task: A DMD Application

  • Conference paper
Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Abstract

This paper presents a semi-automatic annotation methodology for the gaze estimation material of the Driver Monitoring Dataset (DMD). The pipeline borrows ideas from Active Learning to annotate data as accurately as possible with less human intervention. An initial "dummy" model, improved through iterative training, and other state-of-the-art (SoA) models together drive an automatic label assessment strategy that annotates new material. The newly annotated data is then used to retrain the dummy model, and the loop repeats. The results show a 60% reduction in human annotation work, with the automatically annotated images reaching a reliability of 99%.
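
The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the label assessment rule, so the agreement threshold used here is an assumption, and every name (`assess_label`, `train_threshold_model`, `annotation_round`), the toy 1-D samples, and the threshold "dummy" model are hypothetical stand-ins chosen only to keep the example runnable.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Predictor = Callable[[float], str]


def assess_label(votes: Sequence[str], min_agreement: float) -> Optional[str]:
    """Assumed rule: accept the majority label only if enough predictors agree."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


def train_threshold_model(labelled: Sequence[Tuple[float, str]]) -> Predictor:
    """Toy 'dummy model': a threshold halfway between the two class means.

    Requires at least one sample of each class ("left" and "right").
    """
    left = [x for x, y in labelled if y == "left"]
    right = [x for x, y in labelled if y == "right"]
    threshold = (sum(left) / len(left) + sum(right) / len(right)) / 2
    return lambda x: "left" if x < threshold else "right"


def annotation_round(
    unlabelled: Sequence[float],
    dummy: Predictor,
    reference_models: Sequence[Predictor],
    min_agreement: float = 1.0,
) -> Tuple[List[Tuple[float, str]], List[float]]:
    """Split samples into automatically labelled ones and ones left for a human."""
    auto: List[Tuple[float, str]] = []
    manual: List[float] = []
    for x in unlabelled:
        votes = [dummy(x)] + [model(x) for model in reference_models]
        label = assess_label(votes, min_agreement)
        if label is None:
            manual.append(x)  # predictors disagree: needs human annotation
        else:
            auto.append((x, label))
    return auto, manual


# One iteration of the loop on toy data.
seed = [(-1.0, "left"), (1.0, "right")]          # small manually labelled set
dummy = train_threshold_model(seed)              # initial dummy model
reference_models = [lambda x: "left" if x < 0.2 else "right"]  # stand-in SoA model

auto, manual = annotation_round([-0.8, 0.1, 0.9], dummy, reference_models)
dummy = train_threshold_model(seed + auto)       # retrain, then repeat the loop
```

In this toy run the two predictors disagree only on the sample `0.1`, so that sample is routed to the human annotator while the other two are labelled automatically and fed back into training.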

Notes

  1. https://dmd.vicomtech.org/

  2. https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/tree/master/annotation-tool

  3. https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/tree/master/exploreMaterial-tool

  4. https://vcd.vicomtech.org/

  5. https://github.com/erkil1452/gaze360

  6. https://github.com/hysts/pytorch_mpiigaze_demo

Acknowledgement

This work has received funding from the Basque Government under project AutoEv@l of the program ELKARTEK 2021.

Author information

Corresponding author

Correspondence to Paola Natalia Cañas.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Urselmann, T., Cañas, P.N., Ortega, J.D., Nieto, M. (2023). Semi-automatic Pipeline for Large-Scale Dataset Annotation Task: A DMD Application. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_38

  • DOI: https://doi.org/10.1007/978-3-031-25075-0_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25074-3

  • Online ISBN: 978-3-031-25075-0

  • eBook Packages: Computer Science, Computer Science (R0)