Driver Activity Recognition by Fusing Multi-object and Key Points Detection

Pardo-Decimavilla, Pablo; Bergasa, Luis M.; López-Guillén, Elena; Llamazares, Ángel; Abdeselam, Navil; Ocaña, Manuel

doi:10.1007/978-3-031-58676-7_12

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 976)

Included in the following conference series:

Iberian Robotics conference

Abstract

Recognizing driver distraction plays a fundamental role in road safety. In this paper, we present a modular architecture that fuses key point and object detection to predict driver actions. From multi-camera infrared recordings, we temporally localize a variety of actions that lead to distraction. Our system detects objects of interest and extracts key points from the driver. The two are merged by generating features that relate them, and these features are processed by an ML-based classification algorithm. Finally, filters are applied to reduce bounces between predictions and to add temporal context to the detections. Our proposal has been validated on two state-of-the-art driver distraction datasets. Through several experiments, we show that fusion substantially improves the inference of related actions and improves domain adaptation. In addition, our framework is lightweight, explainable, and has low latency because it performs frame-by-frame inference. The modularity of the network allows us to upgrade parts independently or remove a camera without modifying the entire network.
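
The fusion and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The feature choice (Euclidean distance between each key point and each detected object center), the `-1.0` placeholder for undetected objects, the trailing majority-vote filter, and the names `fusion_features` and `majority_filter` are all assumptions made for illustration. The abstract states only that features relating key points and objects are generated, classified, and then filtered.

```python
from collections import Counter
from math import hypot


def fusion_features(keypoints, objects, missing=-1.0):
    """Build one per-frame feature vector relating key points to objects.

    keypoints: dict mapping key point name -> (x, y) in pixels.
    objects:   dict mapping object class -> (x, y) center, or None if the
               class was not detected in this frame.
    Returns a flat list of distances in a fixed (sorted) order, so that
    every frame yields a vector with the same layout.
    """
    features = []
    for obj_name in sorted(objects):
        center = objects[obj_name]
        for kp_name in sorted(keypoints):
            if center is None:
                # Assumed placeholder value for an undetected object.
                features.append(missing)
            else:
                kx, ky = keypoints[kp_name]
                features.append(hypot(kx - center[0], ky - center[1]))
    return features


def majority_filter(labels, window=5):
    """Smooth per-frame labels with a trailing majority vote.

    Only the current frame and the previous window-1 frames are used, so
    the filter never needs future frames.
    """
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# Hypothetical single-frame input: two key points and two object classes.
frame_keypoints = {"nose": (0.0, 0.0), "right_wrist": (3.0, 4.0)}
frame_objects = {"bottle": None, "phone": (0.0, 0.0)}
features = fusion_features(frame_keypoints, frame_objects)

# Hypothetical classifier output with one spurious single-frame label.
raw_labels = ["drive", "drive", "phone", "drive", "drive"]
smoothed_labels = majority_filter(raw_labels, window=3)
```

In a full pipeline, the feature vector would be passed to a per-frame classifier, and the filter would run over the classifier's output sequence. A trailing window is used here because it fits frame-by-frame inference, but the actual filters used in the paper are not specified in the abstract.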

Acknowledgements

This work has been supported by the Spanish PID2021-126623OB-I00 project, funded by MICIN/AEI and FEDER; by the TED2021-130131A-I00 and PDC2022-133470-I00 projects, funded by MICIN/AEI and the European Union NextGenerationEU/PRTR; and by the collaboration scholarship for the 2022–2023 academic year (22C01/007899), financed by the Ministry of Education.

Author information

Authors and Affiliations

RobeSafe Research Group, Electronics Department, Universidad de Alcalá, Alcalá de Henares, Spain
Pablo Pardo-Decimavilla, Luis M. Bergasa, Elena López-Guillén, Ángel Llamazares, Navil Abdeselam & Manuel Ocaña

Corresponding author

Correspondence to Pablo Pardo-Decimavilla.

Editor information

Editors and Affiliations

Dep. de Eng. Electrotecnica e de Computadores, University of Coimbra, Coimbra, Portugal
Lino Marques
Department of Electrónica Industrial, University of Minho, Escola de Engenharia, Guimarães, Portugal
Cristina Santos
Department of Electrical Engineering, Polytechnic Institute of Bragança, Bragança, Portugal
José Luís Lima
Centro Universitario de la Defensa, Zaragoza, Spain
Danilo Tardioli
Centre for Automation and Robotics UPM-CSIC, Universidad Politecnica de Madrid, Madrid, Spain
Manuel Ferre

About this paper

Cite this paper

Pardo-Decimavilla, P., Bergasa, L.M., López-Guillén, E., Llamazares, Á., Abdeselam, N., Ocaña, M. (2024). Driver Activity Recognition by Fusing Multi-object and Key Points Detection. In: Marques, L., Santos, C., Lima, J.L., Tardioli, D., Ferre, M. (eds) Robot 2023: Sixth Iberian Robotics Conference. ROBOT 2023. Lecture Notes in Networks and Systems, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-031-58676-7_12

DOI: https://doi.org/10.1007/978-3-031-58676-7_12
Published: 27 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58675-0
Online ISBN: 978-3-031-58676-7
eBook Packages: Intelligent Technologies and Robotics (R0)
