DOI: 10.1145/3565291.3565333
Research article · ICBDT Conference Proceedings

EPCL: Encoder Perturbation Can Replace the Data Augmentations in Contrastive Learning for Image Classification

Published: 16 December 2022

ABSTRACT

Contrastive learning has been applied to fields such as visual representation and graph representation because it can extract self-supervised information from raw data without manual annotation. The contrastive view is the core of contrastive learning, and current methods mainly construct views through data augmentation: augmentation forms sample pairs, and the model learns by attracting positive pairs and repelling negative pairs. However, current methods mainly perform instance-level augmentation on the original input data, which can introduce semantic differences between positive pairs and degrade performance. At the same time, studies have shown that data augmentation does not play a large role in contrastive learning. To avoid these negative effects of data augmentation, we propose encoder perturbation contrastive learning (EPCL), which abandons image augmentation entirely. Specifically, we take the original image as input and extract features with both a convolutional neural network and a perturbed version of the same network, obtaining two related views for contrast. Experiments show that the information of the image is still well preserved and represented under encoder perturbation, and comparisons with current contrastive learning methods show that our proposed method is strongly competitive.
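The abstract gives no implementation details, so the following is only a minimal sketch of the stated idea in PyTorch: the same raw batch (no image augmentation) is encoded by a network and by a perturbed copy of that network, and the two outputs serve as the positive pair in a contrastive loss. The additive Gaussian weight noise in perturbed_copy and the InfoNCE loss are illustrative assumptions, not the paper's confirmed method.

```python
import copy
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def perturbed_copy(encoder: torch.nn.Module, sigma: float = 0.01) -> torch.nn.Module:
    """Deep-copy the encoder and add Gaussian noise to its weights
    (one assumed form of 'encoder perturbation')."""
    twin = copy.deepcopy(encoder)
    with torch.no_grad():
        for p in twin.parameters():
            p.add_(sigma * torch.randn_like(p))
    return twin

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.2) -> torch.Tensor:
    """Standard InfoNCE: matching rows of z1/z2 are positives, all others negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau            # (N, N) cosine-similarity matrix
    labels = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

encoder = resnet18(num_classes=128)       # final layer reused as a projection head
images = torch.randn(8, 3, 224, 224)      # a batch of RAW images, no augmentation

z1 = encoder(images)                      # view 1: original encoder
z2 = perturbed_copy(encoder)(images)      # view 2: perturbed encoder, same input
loss = info_nce(z1, z2)
loss.backward()                           # gradients reach `encoder` via view 1;
                                          # the perturbed twin is discarded
```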

Published in
ICBDT '22: Proceedings of the 5th International Conference on Big Data Technologies
September 2022 · 454 pages
ISBN: 9781450396875
DOI: 10.1145/3565291
Copyright © 2022 ACM
Publisher: Association for Computing Machinery, New York, NY, United States