Driver Activity Recognition by Fusing Multi-object and Key Points Detection

  • Conference paper
Robot 2023: Sixth Iberian Robotics Conference (ROBOT 2023)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 976)

Abstract

Driver distraction recognition plays a fundamental role in road safety. In this paper, we present a modular architecture that fuses key point and object detection to predict the driver's actions. From multi-camera infrared recordings, we temporally localize a variety of actions that lead to distraction. Our system detects objects of interest and extracts key points from the driver. The two are merged by generating features that relate them, and these features are processed with an ML-based classification algorithm. Finally, filters are applied to reduce flickering between predictions and to add temporal context to the detections. Our proposal has been validated on two state-of-the-art driver distraction datasets. Through several experiments, we show that fusion substantially improves the inference of related actions and aids domain adaptation. In addition, our framework is lightweight, explainable, and has low latency because it performs frame-by-frame inference. The modularity of the network allows us to upgrade parts independently or remove a camera without having to modify the entire network.
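
The pipeline described in the abstract (relating key points to detected objects, classifying each frame, then filtering over time) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the key point and object names, the normalized-distance features, the threshold rule standing in for the ML-based classifier, and the sliding-window majority vote standing in for the temporal filters. The abstract does not specify any of these details.

```python
import math
from collections import Counter, deque

# Hypothetical names chosen for illustration only; not taken from the paper.
KEYPOINTS = ("nose", "right_wrist")
OBJECTS = ("phone", "bottle")


def box_center(box):
    """Return the centre (x, y) of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def fuse_features(keypoints, objects, image_diagonal):
    """Build one feature per (key point, object) pair.

    Each feature is the key-point-to-object-centre distance divided by the
    image diagonal and clamped to 1.0. A missing key point or object also
    yields 1.0, i.e. "as far away as possible" (an assumed convention).
    """
    features = []
    for kp_name in KEYPOINTS:
        kp = keypoints.get(kp_name)
        for obj_name in OBJECTS:
            box = objects.get(obj_name)
            if kp is None or box is None:
                features.append(1.0)
                continue
            cx, cy = box_center(box)
            dist = math.hypot(kp[0] - cx, kp[1] - cy)
            features.append(min(dist / image_diagonal, 1.0))
    return features


def toy_classifier(features, near=0.1):
    """Placeholder threshold rule standing in for a trained classifier."""
    idx = KEYPOINTS.index("right_wrist") * len(OBJECTS) + OBJECTS.index("phone")
    return "phone" if features[idx] < near else "drive"


def majority_filter(labels, window=5):
    """Smooth per-frame labels with a causal sliding-window majority vote."""
    recent = deque(maxlen=window)
    smoothed = []
    for label in labels:
        recent.append(label)
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


if __name__ == "__main__":
    frame_features = fuse_features(
        {"nose": (100, 0), "right_wrist": (100, 100)},
        {"phone": (90, 90, 110, 110)},
        image_diagonal=200.0,
    )
    print(frame_features, toy_classifier(frame_features))

    noisy = ["drive", "drive", "phone", "drive", "drive",
             "phone", "phone", "phone", "phone"]
    print(majority_filter(noisy, window=3))
```

In this sketch the isolated "phone" prediction in the third frame is suppressed by the vote, while the sustained run at the end is kept after a short delay. Because the features are plain distances between named key points and objects, each prediction can be traced back to a concrete geometric relation, which is one way to read the explainability claim above.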

Acknowledgements

This work has been supported by the Spanish project PID2021-126623OB-I00, funded by MICIN/AEI and FEDER; by the projects TED2021-130131A-I00 and PDC2022-133470-I00, funded by MICIN/AEI and the European Union NextGenerationEU/PRTR; and by the collaboration scholarship for the 2022–2023 academic year (22C01/007899), financed by the Ministry of Education.

Author information

Corresponding author

Correspondence to Pablo Pardo-Decimavilla.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Pardo-Decimavilla, P., Bergasa, L.M., López-Guillén, E., Llamazares, Á., Abdeselam, N., Ocaña, M. (2024). Driver Activity Recognition by Fusing Multi-object and Key Points Detection. In: Marques, L., Santos, C., Lima, J.L., Tardioli, D., Ferre, M. (eds) Robot 2023: Sixth Iberian Robotics Conference. ROBOT 2023. Lecture Notes in Networks and Systems, vol 976. Springer, Cham. https://doi.org/10.1007/978-3-031-58676-7_12
