Abstract
Graph Convolutional Networks (GCNs) have brought significant improvements to semi-supervised learning on graph-structured data and have been successfully applied to node classification tasks in network data mining. Many methods have been proposed to improve GCNs, but only a few do so by expanding the training set. Some existing methods expand the label set either by using a random walk, which considers only structural relationships, or by selecting the most confident predictions for each class according to their softmax scores. However, these methods ignore the spatial relationships between nodes in the low-dimensional feature space. In this paper, we propose a method that expands the training set by taking these spatial relationships into account. First, we use existing classification methods to predict pseudo-labels for the nodes, and use these pseudo-labels to compute the category center of the nodes that share the same pseudo-label. Then, we select the k nodes nearest to each category center to expand the training set. Finally, we use the expanded training set to reclassify the nodes. To further verify the proposed method, we also expand the training set with the same number of randomly selected nodes and reclassify the nodes with that expanded training set as a comparison. Comprehensive experiments conducted on several public data sets demonstrate the effectiveness of the proposed method over state-of-the-art methods.
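The selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `expand_training_set`, the use of Euclidean distance, the mean as the category center, and the restriction of candidates to nodes not already in the training set are all assumptions made here. The embeddings are assumed to be low-dimensional node features produced by some existing classifier.

```python
import numpy as np


def expand_training_set(embeddings, pseudo_labels, labeled_idx, k):
    """Pick, for each pseudo-label, the k nodes closest to that class center.

    Hypothetical helper: returns {node_index: pseudo_label} for the nodes
    that would be added to the training set.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    pseudo_labels = np.asarray(pseudo_labels)

    # Mark nodes that already belong to the training set.
    is_labeled = np.zeros(len(pseudo_labels), dtype=bool)
    is_labeled[np.asarray(labeled_idx, dtype=int)] = True

    added = {}
    for c in np.unique(pseudo_labels):
        members = np.flatnonzero(pseudo_labels == c)
        # Category center: mean embedding of all nodes with pseudo-label c
        # (assumption: the center is the arithmetic mean).
        center = embeddings[members].mean(axis=0)
        # Candidates: class members not yet in the training set (assumption).
        candidates = members[~is_labeled[members]]
        # Euclidean distance to the center (assumption: the metric is not
        # specified in the abstract).
        dists = np.linalg.norm(embeddings[candidates] - center, axis=1)
        nearest = candidates[np.argsort(dists, kind="stable")[:k]]
        for i in nearest:
            added[int(i)] = pseudo_labels[i].item()
    return added
```

The returned nodes, together with their pseudo-labels, would be merged into the original training set before the classifier is retrained. The random comparison mentioned in the abstract would replace the distance-based choice with a uniform random sample of the same size.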
Supported by the NSFC-Guangdong Joint Fund (U1501254) and the Joint Fund of NSFC-General Technology Fundamental Research (U1836215).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Tan, L., Yao, W., Li, X. (2020). Expanding Training Set for Graph-Based Semi-supervised Classification. In: Hartmann, S., Küng, J., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2020. Lecture Notes in Computer Science, vol 12392. Springer, Cham. https://doi.org/10.1007/978-3-030-59051-2_16
DOI: https://doi.org/10.1007/978-3-030-59051-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59050-5
Online ISBN: 978-3-030-59051-2
eBook Packages: Computer Science, Computer Science (R0)