Imbalanced driving scene recognition with class focal loss and data augmentation

Zhu, Xianglei; Men, Jianfeng; Yang, Liu; Li, Keqiu

doi:10.1007/s13042-022-01575-x

Original Article
Published: 07 June 2022

Volume 13, pages 2957–2975, (2022)
International Journal of Machine Learning and Cybernetics

Xianglei Zhu¹,², Jianfeng Men¹, Liu Yang¹ (ORCID: orcid.org/0000-0001-8555-5387) & Keqiu Li¹

Abstract

Driving scene recognition based on visual features is essential to the development of intelligent transportation systems. However, real-world driving scene data are class-imbalanced by nature, so the majority and minority classes exhibit different distribution patterns. Specifically, some classes have sufficient samples, while for many other classes only a few samples are available. Under this distribution, deep neural networks have been found to perform poorly on minority classes. To address Imbalanced Driving Scene Recognition (IDSR), this paper presents a novel class focal loss that improves recognition performance on minority scenes. The loss incorporates the per-class sample counts into focal loss, so that training better balances class quantity against sample difficulty. In addition, this paper explores a data augmentation method for imbalanced driving scenes to further improve performance. To evaluate the proposed method, comprehensive experiments were conducted on real-world driving scene datasets. The results show that the proposed method substantially outperforms state-of-the-art methods in class-imbalanced driving scene recognition.
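
The preview does not reproduce the paper's loss function, so the following is only a minimal sketch of the general idea described in the abstract. It assumes focal loss in its standard form, -(1 - p_t)^gamma * log(p_t), multiplied by a per-class weight derived from inverse class frequency. The weighting scheme, the function names, and the example counts are illustrative assumptions, not the authors' formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_weights(class_counts):
    """Inverse-frequency weights, rescaled to average 1 (an assumed scheme)."""
    inv = [1.0 / c for c in class_counts]
    scale = len(inv) / sum(inv)
    return [w * scale for w in inv]


def class_weighted_focal_loss(logits, target, weights, gamma=2.0):
    """Standard focal term scaled by a per-class weight (hypothetical helper).

    (1 - p_t) ** gamma shrinks the loss of well-classified samples;
    weights[target] raises the loss of samples from rare classes.
    """
    p_t = softmax(logits)[target]
    return -weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical two-class example: 900 majority samples, 100 minority samples.
weights = class_weights([900, 100])
loss_majority = class_weighted_focal_loss([2.0, 0.0], 0, weights)
loss_minority = class_weighted_focal_loss([0.0, 2.0], 1, weights)
print(weights, loss_majority, loss_minority)
```

With gamma set to 0 and all weights equal to 1, the function reduces to ordinary cross-entropy, which makes the two added factors easy to inspect separately: one tracks difficulty, the other tracks quantity.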

Notes

The driving scene data were collected by China Automotive Technology and Research Center Company Ltd.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62076179 and Grant 61732011, and in part by the Beijing Natural Science Foundation under Grant Z180006.

Author information

Xianglei Zhu and Jianfeng Men contributed equally to this work and should be regarded as co-first authors.

Authors and Affiliations

College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
Xianglei Zhu, Jianfeng Men, Liu Yang & Keqiu Li
China Automotive Technology and Research Center Co. Ltd., Tianjin, 300300, China
Xianglei Zhu

Corresponding author

Correspondence to Liu Yang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (RAR 15,681 KB)

About this article

Cite this article

Zhu, X., Men, J., Yang, L. et al. Imbalanced driving scene recognition with class focal loss and data augmentation. Int. J. Mach. Learn. & Cyber. 13, 2957–2975 (2022). https://doi.org/10.1007/s13042-022-01575-x

Received: 17 January 2022
Accepted: 30 April 2022
Published: 07 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s13042-022-01575-x
