Abstract
Graph learning, such as node classification, is typically carried out in a closed-world setting: a number of nodes are labeled, and the learning goal is to correctly classify the remaining (unlabeled) nodes into the classes represented by the labeled nodes. In reality, due to limited labeling capacity or the dynamically evolving nature of networks, some nodes may not belong to any existing/seen class and therefore cannot be correctly classified by closed-world learning algorithms. In this paper, we propose a new open-world graph learning paradigm, where the goal is to classify nodes belonging to seen classes into their correct categories and to assign nodes not belonging to any seen class to an unseen class. Open-world graph learning faces three major challenges: (1) graphs often lack ready-made features to represent nodes for learning; (2) unseen-class nodes have no labels and may take arbitrary forms different from the labeled classes; and (3) the learner must differentiate whether a node belongs to a seen class or to the unseen class. To tackle these challenges, we propose an uncertain node representation learning principle that uses multiple versions of a node's feature representation to probe a classifier's response, through which we can determine whether the node belongs to the unseen class. Technically, we propose a constrained variational graph autoencoder, trained with label loss and class uncertainty loss constraints, to ensure that node representation learning is sensitive to the unseen class. As a result, node embeddings are represented by distributions rather than deterministic feature vectors. To test the certainty of a node belonging to seen classes, a sampling process generates multiple feature vectors for each node, and automatic thresholding rejects nodes not belonging to seen classes as unseen-class nodes.
Experiments using graph convolutional networks and graph attention networks on four real-world networks demonstrate the algorithm's performance. Case studies and ablation analysis further show the advantage of uncertain representation learning and automatic threshold selection for open-world graph learning.
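The rejection step described above can be illustrated with a minimal sketch. Assuming each node's embedding is a diagonal Gaussian (mean `mu`, log-variance `logvar`) produced by the variational encoder, and assuming a simple linear softmax classifier (`W`, `b`) over the seen classes, we draw multiple reparameterized samples, average the classifier's softmax responses, and reject the node as unseen when the best seen-class probability stays below the threshold. All names and the linear classifier here are illustrative stand-ins, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classify_open_world(mu, logvar, W, b, threshold, n_samples=20, rng=rng):
    """Sample multiple embeddings z ~ N(mu, sigma^2) for one node,
    average the classifier's softmax responses over the samples, and
    reject the node as unseen (-1) when even the best seen-class
    probability stays below the threshold."""
    sigma = np.exp(0.5 * logvar)
    probs = []
    for _ in range(n_samples):
        # reparameterized sample of the node's uncertain representation
        z = mu + sigma * rng.standard_normal(mu.shape)
        probs.append(softmax(z @ W + b))
    mean_probs = np.mean(probs, axis=0)
    if mean_probs.max() < threshold:
        return -1  # unseen class: classifier is uncertain on every sample
    return int(mean_probs.argmax())
```

A node whose sampled representations consistently activate one seen class is accepted into that class; a node whose samples spread the classifier's response across classes falls below the threshold and is rejected as unseen. In the paper the threshold itself is chosen automatically rather than fixed by hand.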
Notes
Note that node classification in this article refers to single-label classification.
Acknowledgements
This research was supported by the U.S. National Science Foundation (NSF) through Grant Nos. IIS-1763452, CNS-1828181, and IIS-2027339.
Cite this article
Wu, M., Pan, S. & Zhu, X. OpenWGL: open-world graph learning for unseen class node classification. Knowl Inf Syst 63, 2405–2430 (2021). https://doi.org/10.1007/s10115-021-01594-0