CSRP: Modeling class spatial relation with prototype network for novel class discovery

Published in Applied Intelligence.

Abstract

Novel Class Discovery (NCD) is a learning paradigm within the open-world setting, in which machine learning models leverage prior knowledge to group unknown samples into semantic clusters without supervision. Recent research shows that preserving class relations can help classifiers recognize unknown classes. Inspired by this finding, we propose Class-Spatial-Relation modeling with a Prototype network (CSRP). A prototype network is a classification model that learns a prototype for each class and assigns a sample to a class according to its similarity to these prototypes; it captures complex class boundaries better than linear classifiers, offering greater flexibility and accuracy. Specifically, the proposed prototype network models the spatial structure of classes through the distances between samples and each prototype, which better captures class-relation information and improves the model's interpretability and robustness. In addition, we perform knowledge distillation on known and unknown classes simultaneously to balance the model's classification performance across classes. To evaluate the effectiveness and generality of our method, we conduct extensive experiments on the CIFAR-100 dataset and on three fine-grained datasets: Stanford Cars, CUB-200-2011, and FGVC-Aircraft. Our method is comparable to existing state-of-the-art performance on the standard CIFAR-100 dataset, while on the three fine-grained datasets it surpasses the baseline by 3%-9%. Moreover, our method produces more compact clusters in the latent space than linear classification. These results demonstrate the effectiveness of our approach.
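The core idea of the abstract — classify a sample by its distance to learned class prototypes rather than with a linear head — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the prototype values, feature dimensionality, and function names below are assumptions made for the example, and a trained CSRP model would learn prototypes from data.

```python
import math

def prototype_logits(x, prototypes):
    # Negative squared Euclidean distance to each class prototype:
    # the closer a prototype, the larger its logit.
    return [-sum((xi - pi) ** 2 for xi, pi in zip(x, p)) for p in prototypes]

def softmax(logits):
    # Numerically stable softmax over the distance-based logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative 2-D features with three hypothetical class prototypes.
prototypes = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
sample = [2.6, 0.4]

probs = softmax(prototype_logits(sample, prototypes))
pred = max(range(len(probs)), key=probs.__getitem__)
print(pred)  # index of the nearest prototype
```

Because the logits are plain distances in feature space, the resulting probabilities directly encode how a sample sits relative to every class, which is the "spatial relation" information a linear classifier discards.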


[Figures 1-8 and Algorithm 1 appear in the full article.]


Data availability

The data used in this study are sourced from publicly available datasets, which can be accessed at the following URLs:

  • Animals: https://www.kaggle.com/datasets/alessiocorrado99/animals10
  • CIFAR-100: http://www.cs.toronto.edu/~kriz/cifar.html
  • Stanford Cars: http://imagenet.stanford.edu/internal/car196
  • CUB-200-2011: https://www.vision.caltech.edu/datasets/cub-200-2011/
  • FGVC-Aircraft: https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/


Acknowledgements

We acknowledge and appreciate the efforts of the dataset creators in making their data available to the research community. This study utilized the dataset in accordance with its stated permissions and guidelines. Researchers interested in accessing and utilizing the data for their own analyses can refer to the provided URL to obtain the necessary files and information.

Funding

This work was supported by the Science and Technology Development Fund (FDCT) of Macau under Grants No. 0071/2022/A and No. 0010/2024/AGJ.

Author information

Authors and Affiliations

Authors

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Wei Jin, Jiuqing Dong and Huiwen Guo. The first draft of the manuscript was written by Wei Jin and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript. Supervision: Nannan Li, Wenmin Wang and Chuanchuan You.

Corresponding author

Correspondence to Nannan Li.

Ethics declarations

Ethical and informed consent for data used

This study adheres to ethical guidelines and obtained informed consent for the data used. All data sources and participants involved in the research provided their consent for the collection, analysis, and publication of the data. This declaration affirms our commitment to conducting research in an ethical manner and upholding the principles of informed consent.

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Conflicts of interest

There are no financial or non-financial relationships that could be perceived as influencing the integrity or objectivity of the research conducted. This declaration ensures transparency and assures readers that there are no conflicts of interest that could compromise the validity of the findings presented in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Jin, W., Li, N., Dong, J. et al. CSRP: Modeling class spatial relation with prototype network for novel class discovery. Appl Intell 55, 207 (2025). https://doi.org/10.1007/s10489-024-05946-5

