LocMix: local saliency-based data augmentation for image classification

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Data augmentation is a crucial strategy for tackling issues such as inadequate model robustness and a large generalization gap. It combats overfitting, improves deep neural network performance, and enhances generalization, particularly when data are limited. In recent years, mixed sample data augmentation (MSDA), including variants such as Mixup and CutMix, has attracted significant attention. However, these methods can confound the network with misleading signals, which limits their effectiveness. In this context, we propose LocMix, an MSDA method that generates new training samples by prioritizing local saliency information and mixing data statistically. We conceal salient regions with random masks and combine images efficiently by optimizing local saliency information with transport-based methods. Prioritizing the local features within an image allows LocMix to capture image details more accurately and comprehensively, thereby enhancing the model's capacity to understand the target image. We validate this approach extensively on several challenging datasets. When applied to training a PreAct-ResNet18 model, our method yields notable gains: on CIFAR-10 we observe a 1.71% accuracy improvement, and on CIFAR-100, Tiny-ImageNet, ImageNet, and SVHN we attain accuracies of 80.12%, 64.60%, 77.62%, and 97.12%, corresponding to improvements of 4.88%, 8.75%, 1.93%, and 0.57%, respectively. These experimental results clearly demonstrate the effectiveness of the proposed method.
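To make the mixing idea concrete, the sketch below shows a minimal saliency-guided mixed-sample augmentation step in PyTorch. It illustrates the general technique the abstract describes, not the paper's exact LocMix algorithm: the gradient-based saliency estimate, the rectangular patch centered on the saliency peak, and the helper names (saliency_map, saliency_mix) are simplifying assumptions made for demonstration.

    # Illustrative sketch of saliency-guided mixed-sample augmentation
    # (in the spirit of LocMix; NOT the paper's exact algorithm).
    import torch
    import torch.nn.functional as F

    def saliency_map(model, x, y):
        """Gradient-magnitude saliency: |d loss / d pixel|, summed over channels."""
        x = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        return grad.abs().sum(dim=1)                       # (B, H, W)

    def saliency_mix(model, x, y, lam=0.5):
        """Paste a patch centered on each partner image's saliency peak;
        labels are later mixed in proportion to the pasted area."""
        b, _, h, w = x.shape
        perm = torch.randperm(b, device=x.device)
        sal = saliency_map(model, x[perm], y[perm])        # partner saliency
        peak = sal.reshape(b, -1).argmax(dim=1)            # flat index of peak
        cy, cx = peak // w, peak % w
        ph = int(h * (1.0 - lam) ** 0.5)                   # patch height
        pw = int(w * (1.0 - lam) ** 0.5)                   # patch width
        mixed = x.clone()
        for i in range(b):
            y0 = int((cy[i] - ph // 2).clamp(0, h - ph))   # keep patch in bounds
            x0 = int((cx[i] - pw // 2).clamp(0, w - pw))
            mixed[i, :, y0:y0 + ph, x0:x0 + pw] = x[perm[i], :, y0:y0 + ph, x0:x0 + pw]
        lam_adj = 1.0 - (ph * pw) / (h * w)                # fraction of original kept
        return mixed, y, y[perm], lam_adj

During training, the mixed batch would be used with a proportionally mixed loss, e.g. lam_adj * F.cross_entropy(out, y_a) + (1 - lam_adj) * F.cross_entropy(out, y_b), following the usual MSDA recipe.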

Data Availability

All datasets used in this work are sourced from publicly available repositories and can be downloaded from the respective official websites.

Funding

This work was funded by the National Natural Science Foundation of China under Grant No. 61772180 and by the Key R&D Plan of Hubei Province under Grant No. 2023BCB041.

Author information

Contributions

LY and YY performed the main manuscript work and experiments. WC and SY created Tables 3 and 4. All authors participated in manuscript review.

Corresponding author

Correspondence to Yu Ye.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yan, L., Ye, Y., Wang, C. et al. LocMix: local saliency-based data augmentation for image classification. SIViP 18, 1383–1392 (2024). https://doi.org/10.1007/s11760-023-02852-0
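For reference managers, the citation above corresponds to the following BibTeX entry. The entry key is arbitrary; all field values are taken from the citation line, with the remaining co-authors left abbreviated ("and others") as in the original "et al.":

    @article{yan2024locmix,
      author  = {Yan, L. and Ye, Y. and Wang, C. and others},
      title   = {LocMix: local saliency-based data augmentation for image classification},
      journal = {Signal, Image and Video Processing},
      volume  = {18},
      pages   = {1383--1392},
      year    = {2024},
      doi     = {10.1007/s11760-023-02852-0}
    }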
