Transformer-based contrastive learning framework for image anomaly detection

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Anomaly detection is the problem of uncovering patterns in a data set that do not conform to expected behavior. Owing to the continuing development of deep representation learning, many anomaly detection approaches based on deep learning models have recently been developed and have achieved promising performance. In this work, we propose an image anomaly detection approach based on a contrastive learning framework. Most previous deep learning-based image anomaly detection approaches learn representations from training samples with ResNet or other CNN-based deep neural networks; our framework instead adopts a Transformer to extract better representations. We then develop a triple contrastive loss function and embed it into the framework to alleviate the catastrophic collapse that many anomaly detection approaches encounter. Furthermore, a nonlinear Projector is integrated into our model to improve anomaly detection performance. The effectiveness of the approach is validated through experiments on multiple benchmark data sets. According to the experimental results, our approach obtains better or comparable performance relative to state-of-the-art anomaly detection approaches.
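The abstract does not give the form of the triple contrastive loss or the layout of the Projector, so the following is only a minimal sketch of the generic building blocks such a framework typically rests on: a two-layer nonlinear projection head and an InfoNCE-style contrastive loss for one anchor. It is not the authors' method. All names (`project`, `init_projector`, `info_nce`), the layer sizes, and the temperature of 0.5 are assumptions made for illustration, and the input vectors stand in for features that an encoder such as a Transformer would produce.

```python
import math
import random


def linear(v, weights, bias):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, x) for x in v]


def init_projector(dim_in, dim_hidden, dim_out, seed=0):
    """Randomly initialise a hypothetical Linear -> ReLU -> Linear head."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim_in)] for _ in range(dim_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(dim_hidden)] for _ in range(dim_out)]
    return w1, [0.0] * dim_hidden, w2, [0.0] * dim_out


def project(v, params):
    """Map an encoder feature to the space where the loss is computed."""
    w1, b1, w2, b2 = params
    return linear(relu(linear(v, w1, b1)), w2, b2)


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
    return [x / norm for x in v]


def cosine(u, v):
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss: pull the positive close, push negatives away."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Usage: project two made-up "views" of one image and one unrelated sample.
params = init_projector(dim_in=4, dim_hidden=8, dim_out=3)
view_a = project([0.9, 0.1, 0.3, 0.2], params)
view_b = project([0.8, 0.2, 0.3, 0.1], params)
other = project([0.1, 0.9, 0.0, 0.7], params)
loss = info_nce(view_a, view_b, [other])
```

The loss is small when the two views of the same image land close together after projection and the unrelated sample lands far away. Computing the loss on projected features rather than directly on encoder outputs is a common design choice in contrastive learning, and is the role a nonlinear projector usually plays.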

Data Availability

The CIFAR10 and CIFAR100 data sets are available at: [http://www.cs.toronto.edu/~kriz/cifar.html]. The CatsVsDogs data set is available at: [https://www.kaggle.com/c/dogs-vs-cats/overview]. The BreakHis data set is available at: [https://web.inf.ufpr.br/vri/databases/].

Notes

  1. Our source code is available at https://github.com/sutusutu/TMSCL.

Acknowledgements

The completion of this work was supported in part by the National Natural Science Foundation of China (62276106), the UIC Start-up Research Fund (UICR0700056-23), the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science (2022B1212010006), the Guangdong Higher Education Upgrading Plan (2021-2025) of “Rushing to the Top, Making Up Shortcomings and Strengthening Special Features” (R0400001-22), and the Artificial Intelligence and Data Science Research Hub (AIRH) of BNU-HKBU United International College (UIC).

Author information

Corresponding author

Correspondence to Wentao Fan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fan, W., Shangguan, W. & Chen, Y. Transformer-based contrastive learning framework for image anomaly detection. Int. J. Mach. Learn. & Cyber. 14, 3413–3426 (2023). https://doi.org/10.1007/s13042-023-01840-7

  • DOI: https://doi.org/10.1007/s13042-023-01840-7
