
CyclicShift: A Data Augmentation Method For Enriching Data Patterns

Authors:

Ming Liu

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 4921 - 4929

https://doi.org/10.1145/3503161.3548188

Published: 10 October 2022

Abstract

In this paper, we propose a simple yet effective data augmentation strategy, dubbed CyclicShift, to enrich data patterns. The idea is to shift the image in a certain direction and then circularly refill the resulting out-of-frame part on the opposite side. Compared with the related methods Translation and Shuffle, our method avoids losing pixels of the original image and preserves its semantic information as much as possible. Visually and empirically, we show that our method indeed introduces new data patterns and thereby improves both the generalization ability and the performance of models. Extensive experiments demonstrate our method's effectiveness in image classification and fine-grained recognition over multiple datasets and various network architectures. Furthermore, our method can be combined with other data augmentation methods in a very simple way. CyclicMix, the simultaneous use of CyclicShift and CutMix, hits a new high in most cases. Our code is open-source and available at https://github.com/dejavunHui/CyclicShift.
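
The shift-and-refill idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cyclic_shift` and `random_cyclic_shift`, the nested-list image layout, and the uniform sampling of offsets are all assumptions made for illustration, since the abstract does not specify how shift direction and distance are chosen.

```python
import random


def cyclic_shift(image, dx, dy):
    """Shift a 2D image (a list of rows) right by dx and down by dy.

    Pixels pushed out of the frame re-enter on the opposite side,
    so no pixel of the original image is lost.
    """
    height = len(image)
    width = len(image[0])
    dx %= width
    dy %= height
    # Output row r is taken from input row (r - dy) mod height.
    rows = [image[(r - dy) % height] for r in range(height)]
    # Output column c is taken from input column (c - dx) mod width.
    return [[row[(c - dx) % width] for c in range(width)] for row in rows]


def random_cyclic_shift(image, rng=random):
    """Apply a cyclic shift with uniformly sampled offsets (an assumed policy)."""
    height, width = len(image), len(image[0])
    return cyclic_shift(image, rng.randrange(width), rng.randrange(height))


image = [
    [1, 2, 3],
    [4, 5, 6],
]
print(cyclic_shift(image, dx=1, dy=0))  # [[3, 1, 2], [6, 4, 5]]
```

Because the operation only permutes pixel positions, the multiset of pixel values is unchanged, which is the property the abstract contrasts with plain translation. On array data the same wrap-around shift is commonly written with NumPy's `np.roll` along the height and width axes.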

Supplementary Material

MP4 File (MM22-fp1822.mp4)

Presentation video


Cited By

Wang X, Wu L, Hu B, Yang X, Fan X, Liu M, Cheng K, Wang S, Miao J, and Gong H. (2024). PatchRLNet: A Framework Combining a Vision Transformer and Reinforcement Learning for The Separation of a PTFE Emulsion and Paraffin. Electronics 13:2 (339). Online publication date: 12-Jan-2024. https://doi.org/10.3390/electronics13020339

Sun Y, Wu L, Chen P, Zhang F, and Xu L. (2023). Using deep learning in pathology image analysis: A novel active learning strategy based on latent representation. Electronic Research Archive 31:9 (5340-5361). Online publication date: 2023. https://doi.org/10.3934/era.2023271

Liu K, Feng Y, Zhang L, Wang R, Wang W, Yuan X, Cui X, Li X, and Li H. (2023). An Effective Personality-Based Model for Short Text Sentiment Classification Using BiLSTM and Self-Attention. Electronics 12:15 (3274). Online publication date: 30-Jul-2023. https://doi.org/10.3390/electronics12153274

Index Terms

CyclicShift: A Data Augmentation Method For Enriching Data Patterns
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

October 2022

7537 pages

ISBN: 9781450392037

DOI: 10.1145/3503161

General Chairs:
João Magalhães (NOVA University of Lisbon, Portugal)
Alberto del Bimbo (University of Florence, Italy)
Shin'ichi Satoh (National Institute of Informatics, Japan)
Nicu Sebe (University of Trento, Italy)

Program Chairs:
Xavier Alameda-Pineda (Inria, Grenoble, France)
Qin Jin (Renmin University of China, China)
Vincent Oria (New Jersey Institute of Technology, USA)
Laura Toni (University College London, UK)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Medico-Engineering Cooperation Funds from University of Electronic Science and Technology of China
Science and Technology Program of Quzhou

Conference

MM '22

Sponsor:

SIGMM

MM '22: The 30th ACM International Conference on Multimedia

October 10 - 14, 2022

Lisboa, Portugal

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
