Assessing the YOLO Series Through Empirical Analysis on the KITTI Dataset for Autonomous Driving

Ramos, Filipa; Correia, Alexandre; Rossetti, Rosaldo J. F.

doi:10.1007/978-3-030-38822-5_14

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 310)

Included in the following conference series:

International Conference on Intelligent Transport Systems

Abstract

Computer vision and deep learning have been widely popularised since the turn of the 21st century, and autonomous driving sits at the centre of their applications. As companies directly and indirectly involved with transportation systems race to solve this challenge, it is pertinent to evaluate how generic, state-of-the-art models perform on datasets built specifically for autonomous driving research. To this end, this article studies the evolution of the YOLO (You Only Look Once) model from its first implementation to the most recent version 3. Experiments carried out on the widely acknowledged driving dataset and benchmark known as the KITTI Vision Benchmark enable a direct comparison between the newest version and its predecessor. Results show a performance gap between the two versions when they are tested on the same dataset with a similar configuration setup. YOLO version 3 gains accuracy while losing only minimally in detection speed. Some conclusions on the applicability of such models to a real-world scenario are drawn so as to anticipate the direction of research in the area of autonomous driving.

Notes

1. Found at https://pjreddie.com/darknet/yolo/.
2. Download at http://www.cvlibs.net/datasets/kitti/raw_data.php.

Acknowledgements

This work is supported by European Structural and Investment Funds, in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project no. 037902; Funding Reference: POCI-01-0247-FEDER-037902].

Author information

Authors and Affiliations

Bosch Car Multimedia S.A., 4705-820, Braga, Portugal
Filipa Ramos & Alexandre Correia
Artificial Intelligence and Computer Science Lab, Department of Informatics Engineering, Faculty of Engineering, University of Porto, 4200-465, Porto, Portugal
Rosaldo J. F. Rossetti

Corresponding author

Correspondence to Filipa Ramos.

Editor information

Editors and Affiliations

ISCTE-IUL, Lisbon, Portugal
Ana Lúcia Martins
ISCTE-IUL, Lisbon, Portugal
Joao Carlos Ferreira
University of Pisa, Pisa, Italy
Alexander Kocian

About this paper

Cite this paper

Ramos, F., Correia, A., Rossetti, R.J.F. (2020). Assessing the YOLO Series Through Empirical Analysis on the KITTI Dataset for Autonomous Driving. In: Martins, A., Ferreira, J., Kocian, A. (eds) Intelligent Transport Systems. From Research and Development to the Market Uptake. INTSYS 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 310. Springer, Cham. https://doi.org/10.1007/978-3-030-38822-5_14

DOI: https://doi.org/10.1007/978-3-030-38822-5_14
Published: 10 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38821-8
Online ISBN: 978-3-030-38822-5
eBook Packages: Computer Science, Computer Science (R0)
