Abstract
A novel data augmentation approach based on irregular superpixel decomposition is proposed. This approach, called SuperpixelGridMasks, makes it possible to extend the original image datasets required for training machine learning-based analysis architectures, with the aim of increasing their performance. Three variants named SuperpixelGridCut, SuperpixelGridMean, and SuperpixelGridMix are presented. These grid-based methods produce a new style of image transformations by dropping and fusing information. Extensive experiments using various image classification models, as well as precision health and other real-world datasets, show that our methods can significantly outperform baseline performance. The comparative study also shows that our methods can surpass the performance of other data augmentations. The SuperpixelGridCut, SuperpixelGridMean, and SuperpixelGridMix code is publicly available at https://github.com/hammoudiproject/SuperpixelGridMasks.











Code Availability
The source code for generating the presented SuperpixelGridMasks data augmentations is publicly available online at: https://github.com/hammoudiproject/SuperpixelGridMasks.
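For readers who want a rough intuition before opening the repository, the following is a minimal sketch of how superpixel-based dropping and fusing could look. It is not the authors' implementation. The function name `superpixel_grid_masks`, its parameters (`mode`, `drop_ratio`, `other`), and the interpretation of the three variants (Cut zeroes out regions, Mean fills them with their average color, Mix copies them from a second image) are assumptions inferred from the variant names. The label map is also assumed to be given; in practice it might come from a superpixel algorithm such as SLIC, whereas the toy example uses a regular 2x2 block layout as a stand-in.

```python
import numpy as np


def superpixel_grid_masks(image, labels, mode="cut", drop_ratio=0.5,
                          other=None, rng=None):
    """Hypothetical sketch: alter a random subset of superpixels.

    image  : (H, W, C) array
    labels : (H, W) integer array with one id per superpixel
    mode   : "cut"  -> set selected regions to zero
             "mean" -> fill selected regions with their mean color
             "mix"  -> copy selected regions from `other`
    Returns the augmented image and the ids of the altered superpixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.astype(np.float64)  # astype returns a copy
    ids = np.unique(labels)
    n_selected = int(round(drop_ratio * ids.size))
    selected = rng.choice(ids, size=n_selected, replace=False)

    for sp in selected:
        region = labels == sp  # boolean mask of one superpixel
        if mode == "cut":
            out[region] = 0.0
        elif mode == "mean":
            out[region] = image[region].mean(axis=0)
        elif mode == "mix":
            if other is None:
                raise ValueError("mode='mix' needs a second image `other`")
            out[region] = other[region]
        else:
            raise ValueError(f"unknown mode: {mode!r}")

    return out.astype(image.dtype), selected


# Toy data: a 4x4 RGB image split into four 2x2 blocks (assumed layout).
img = np.arange(48, dtype=np.uint8).reshape(4, 4, 3)
labels = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0), 2, axis=1)
augmented, altered = superpixel_grid_masks(
    img, labels, mode="mean", rng=np.random.default_rng(0)
)
```

Passing an explicit `rng` keeps the selection reproducible, and returning the altered ids makes it easy to inspect which regions were changed. The released code at the URL above remains the reference for the actual method.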
Notes
Chest X-Ray Images dataset: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
A PASCAL VOC dataset: http://host.robots.ox.ac.uk/pascal/VOC/databases.html#VOC2005_1
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Karim Hammoudi and Adnane Cabani contributed equally to this work.
About this article
Cite this article
Hammoudi, K., Cabani, A., Slika, B. et al. SuperpixelGridMasks Data Augmentation: Application to Precision Health and Other Real-world Data. J Healthc Inform Res 6, 442–460 (2022). https://doi.org/10.1007/s41666-022-00122-1
DOI: https://doi.org/10.1007/s41666-022-00122-1