DOI: 10.1145/3589334.3645466

Research article · Free Access · Artifacts Available / v1.1

SMUG: Sand Mixing for Unobserved Class Detection in Graph Few-Shot Learning

Published: 13 May 2024

ABSTRACT

Graph few-shot learning (GFSL) has achieved great success in node classification tasks with rare labels. However, graph few-shot classification (GFSC) models often face test samples from unobserved (or unknown) classes due to the scarcity of labels. We formulate this problem as out-of-distribution (OOD) sample detection in inductive graph few-shot learning. This paper presents SMUG, a novel GFSL framework that can detect unobserved classes. Since a practical training dataset contains no ground-truth OOD samples, it is challenging for the GFSC model to extract knowledge about unknown classes from labeled samples. To address this difficulty, we propose a sand mixing scheme that introduces observed classes as artificial OOD samples into meta-tasks. We also develop two unsupervised OOD discriminators to identify OOD samples. Because we know the true classes of these artificial OOD samples, we can assess the performance of the OOD discriminators. Subsequently, we design a novel training procedure that optimizes the encoder based on the performance of the OOD discriminators and the GFSC model. It not only enables the GFSL model to distinguish OOD samples but also improves the classification accuracy of normal samples. We conduct extensive experiments on four benchmark datasets to evaluate the effectiveness of SMUG. Experimental results demonstrate that SMUG achieves superior performance over state-of-the-art approaches in both OOD detection and node classification. The source code of this paper is available at https://github.com/Memepp/SMUG.
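The sand-mixing idea described above can be illustrated with a minimal sketch: sample an N-way K-shot episode from the observed classes, then inject nodes whose (observed) class falls outside the episode as artificial OOD queries with known ground truth, so an unsupervised discriminator can be evaluated. Everything below is an illustrative assumption rather than the paper's actual implementation (which is in the linked repository): the function names, the toy embeddings, and the prototype-distance score standing in for the paper's two discriminators.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sand_mixed_task(embeddings, labels, observed_classes,
                           n_way=3, k_shot=2, n_query=2, n_ood=2):
    """Build one N-way K-shot episode, then 'mix in' pseudo-OOD queries."""
    # Pick the episode's N classes from the observed classes.
    classes = rng.choice(observed_classes, size=n_way, replace=False)
    support, query = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(labels == c))
        support.append(embeddings[idx[:k_shot]])
        query.append(embeddings[idx[k_shot:k_shot + n_query]])
    # Sand mixing: nodes of observed classes OUTSIDE this episode serve as
    # artificial OOD samples whose true class is known at training time.
    ood_pool = np.flatnonzero(~np.isin(labels, classes))
    ood = embeddings[rng.choice(ood_pool, size=n_ood, replace=False)]
    return support, query, ood

def prototype_ood_score(support, x):
    """Distance to the nearest class prototype; larger means more OOD-like."""
    protos = np.stack([s.mean(axis=0) for s in support])
    return np.linalg.norm(protos - x, axis=1).min()

# Toy setup: 6 well-separated classes, 20 nodes each, 6-dim embeddings.
labels = np.repeat(np.arange(6), 20)
embeddings = 5.0 * np.eye(6)[labels] + 0.1 * rng.standard_normal((120, 6))

support, query, ood = sample_sand_mixed_task(embeddings, labels, np.arange(6))
in_scores = [prototype_ood_score(support, q) for qs in query for q in qs]
ood_scores = [prototype_ood_score(support, o) for o in ood]
print(np.mean(in_scores), np.mean(ood_scores))
```

Since the true labels of the mixed-in samples are known, one can directly check that the discriminator scores them higher than in-episode queries, which is the training signal SMUG exploits.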

Supplemental Material

rfp0899.mp4 — supplemental video (mp4, 105.8 MB)


Published in

WWW '24: Proceedings of the ACM on Web Conference 2024
May 2024, 4826 pages
ISBN: 9798400701719
DOI: 10.1145/3589334

Copyright © 2024 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

research-article

Acceptance Rate

Overall acceptance rate: 1,899 of 8,196 submissions (23%)