ABSTRACT
Graph few-shot learning (GFSL) has achieved great success in node classification tasks with rare labels. However, graph few-shot classification (GFSC) models often face test samples from unobserved (or unknown) classes, precisely because labels are rare. We formulate this problem as out-of-distribution (OOD) sample detection in inductive graph few-shot learning. This paper presents SMUG, a novel GFSL framework that can detect unobserved classes. Since practical training datasets contain no ground-truth OOD samples, it is challenging for a GFSC model to acquire knowledge about unknown classes from labeled samples. To address this difficulty, we propose a sand mixing scheme that injects samples from observed classes into meta-tasks as artificial OOD samples, and we develop two unsupervised OOD discriminators to identify OOD samples. Because the true classes of these artificial OOD samples are known, we can assess the performance of the OOD discriminators. We further design a novel training procedure that optimizes the encoder based on the performance of both the OOD discriminators and the GFSC model; it not only enables the GFSL model to distinguish OOD samples but also improves the classification accuracy on normal samples. We conduct extensive experiments on four benchmark datasets. The results demonstrate that SMUG outperforms state-of-the-art approaches in both OOD detection and node classification. The source code is available at https://github.com/Memepp/SMUG.
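To illustrate the sand mixing idea described above, the following is a minimal sketch (not the authors' implementation) of how a meta-task might be assembled: some observed classes form the N-way K-shot support set, while nodes from other observed classes, held out of the support set, are mixed into the query set as artificial OOD samples whose true classes remain known. The function name, signature, and label layout here are illustrative assumptions.

```python
import random

def sample_sand_mixed_task(labels_by_class, n_way=5, k_shot=3,
                           n_ood=2, q_query=5, seed=0):
    """Sample one few-shot meta-task whose query set is 'sand mixed' with
    pseudo-OOD nodes drawn from held-out observed classes.
    labels_by_class: dict mapping class id -> list of node ids.
    (Illustrative sketch; the real SMUG task sampler may differ.)"""
    rng = random.Random(seed)
    classes = list(labels_by_class)
    # Split the observed classes into in-distribution (ID) and pseudo-OOD roles.
    chosen = rng.sample(classes, n_way + n_ood)
    id_classes, ood_classes = chosen[:n_way], chosen[n_way:]

    support, query = [], []
    for c in id_classes:
        nodes = rng.sample(labels_by_class[c], k_shot + q_query)
        support += [(n, c) for n in nodes[:k_shot]]
        query += [(n, c, False) for n in nodes[k_shot:]]  # False: in-distribution
    for c in ood_classes:
        # Pseudo-OOD queries: their true class is known, so an OOD
        # discriminator can be evaluated without ground-truth OOD data.
        for n in rng.sample(labels_by_class[c], q_query):
            query.append((n, c, True))  # True: treated as OOD in this task
    rng.shuffle(query)
    return support, query
```

Because the pseudo-OOD flag is carried alongside the (known) true class, the same query set can score both the GFSC classifier and the OOD discriminators during meta-training.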
SMUG: Sand Mixing for Unobserved Class Detection in Graph Few-Shot Learning