Abstract
Novel Class Discovery (NCD) is a learning paradigm for open-world tasks in which machine learning models leverage prior knowledge to group samples of unknown classes into semantic clusters in an unsupervised setting. Recent research notes that maintaining class relations can help classifiers better recognize unknown classes. Inspired by this finding, we propose Class-Spatial-Relation modeling with a Prototype network (CSRP). A prototype network is a machine learning model used for classification tasks. It learns a prototype for each class and makes classification decisions based on the similarity between a given sample and these prototypes. It captures complex class boundaries better than linear classification models, providing higher flexibility and accuracy for classification tasks. Specifically, the proposed prototype network enables spatial modeling based on the distance between samples and each prototype, which better captures class relation information and improves the model’s interpretability and robustness. In addition, we simultaneously perform knowledge distillation on known and unknown classes to balance the model’s classification performance across classes. To evaluate the effectiveness and generality of our method, we perform extensive experiments on the CIFAR-100 dataset and on three fine-grained datasets: Stanford Cars, CUB-200-2011, and FGVC-Aircraft. On the standard CIFAR-100 dataset, our results are comparable to the existing state of the art, while on the three fine-grained datasets our method surpasses the baseline by 3%-9%. In addition, our method creates more compact clusters in the latent space than linear classification does. These results demonstrate the effectiveness of our approach.
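The distance-based decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' CSRP implementation: it assumes prototypes are plain class means (CSRP learns them), uses squared Euclidean distance, and turns negative distances into probabilities with a softmax. All function names, the `temperature` parameter, and the toy data are hypothetical.

```python
import math


def class_prototypes(features, labels):
    """Average the feature vectors of each class to get one prototype per class.

    Assumption: prototypes are class means; a trained prototype network would
    learn them jointly with the feature extractor instead.
    """
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def prototype_probabilities(sample, prototypes, temperature=1.0):
    """Softmax over negative squared distances: closer prototypes score higher."""
    logits = {
        label: -squared_distance(sample, proto) / temperature
        for label, proto in prototypes.items()
    }
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - max_logit) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict(sample, prototypes):
    """Assign the sample to the class whose prototype is most probable (nearest)."""
    probs = prototype_probabilities(sample, prototypes)
    return max(probs, key=probs.get)


# Toy 2-D features for two known classes (illustrative data only).
features = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]]
labels = ["cat", "cat", "truck", "truck"]

prototypes = class_prototypes(features, labels)
probs = prototype_probabilities([0.5, 0.3], prototypes)
print(predict([0.5, 0.3], prototypes), probs)
```

The full probability vector, rather than only the arg-max label, is what carries class relation information: it records how close a sample lies to every prototype, not just the nearest one.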
Data availability
The data used in this study are sourced from publicly available datasets, which can be accessed at the following URLs:
• Animals: https://www.kaggle.com/datasets/alessiocorrado99/animals10
• CIFAR-100: http://www.cs.toronto.edu/~kriz/cifar.html
• Stanford Cars: http://imagenet.stanford.edu/internal/car196
• CUB-200-2011: https://www.vision.caltech.edu/datasets/cub-200-2011/
• Aircraft: https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
Acknowledgements
We acknowledge and appreciate the efforts of the dataset creators in making their data available to the research community. This study utilized the dataset in accordance with its stated permissions and guidelines. Researchers interested in accessing and utilizing the data for their own analyses can refer to the provided URL to obtain the necessary files and information.
Funding
This work was supported by the Science and Technology Development Fund (FDCT) of Macau under Grant Nos. 0071/2022/A and 0010/2024/AGJ.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Wei Jin, Jiuqing Dong and Huiwen Guo. The first draft of the manuscript was written by Wei Jin and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript. Supervision: Nannan Li, Wenmin Wang and Chuanchuan You.
Ethics declarations
Ethical and informed consent for data used
This study adheres to ethical guidelines and obtained informed consent for the data used. All data sources and participants involved in the research provided their consent for the collection, analysis, and publication of the data. This declaration affirms our commitment to conducting research in an ethical manner and upholding the principles of informed consent.
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Conflicts of interest
There are no financial or non-financial relationships that could be perceived as influencing the integrity or objectivity of the research conducted. This declaration ensures transparency and assures readers that there are no conflicts of interest that could compromise the validity of the findings presented in this manuscript.
About this article
Cite this article
Jin, W., Li, N., Dong, J. et al. CSRP: Modeling class spatial relation with prototype network for novel class discovery. Appl Intell 55, 207 (2025). https://doi.org/10.1007/s10489-024-05946-5
DOI: https://doi.org/10.1007/s10489-024-05946-5