Abstract
Graph-regularized semi-supervised learning has been used effectively for classification when (i) data instances are connected through a graph, and (ii) labeled data is scarce. Leveraging multiple relations (or graphs) between the instances can improve prediction performance; however, noisy and/or irrelevant relations may degrade it. As a result, an effective weighting scheme needs to be put in place for robustness.
In this paper, we propose iMUNE, a robust and effective approach for multi-relational graph-regularized semi-supervised classification that is immune to noise. Under a convex formulation, we infer weights for the multiple graphs as well as a solution (i.e., labeling). We provide a careful analysis of the inferred weights, based on which we devise an algorithm that filters out irrelevant and noisy graphs and produces weights proportional to the informativeness of the remaining graphs. Moreover, iMUNE is linearly scalable w.r.t. the number of edges. Through extensive experiments on various real-world datasets, we show the effectiveness of our method, which yields superior results under different noise models, and under an increasing number of noisy graphs and increasing intensity of noise, as compared to a list of baselines and state-of-the-art approaches.
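To make the setting concrete, the following is a minimal sketch of generic graph-regularized label inference over a weighted combination of graphs. It is not the iMUNE algorithm: the graph weights are fixed by hand rather than inferred, and the squared-loss objective, the unnormalized Laplacian, the function names (`laplacian`, `propagate`), the parameter `mu`, and the toy graphs are all assumptions made for illustration. It also assumes every connected component of the combined graph contains at least one labeled node, so the linear system has a unique solution.

```python
import numpy as np

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric adjacency matrix."""
    return np.diag(W.sum(axis=1)) - W

def propagate(graphs, weights, y, labeled, mu=1.0):
    """Minimize sum_{i in labeled} (f_i - y_i)^2 + mu * f^T (sum_k w_k L_k) f.

    Setting the gradient to zero gives the linear system (C + mu * L) f = C y,
    where C is a diagonal 0/1 indicator of the labeled nodes.
    """
    n = y.shape[0]
    L = sum(w * laplacian(W) for w, W in zip(weights, graphs))
    C = np.zeros((n, n))
    C[labeled, labeled] = 1.0
    return np.linalg.solve(C + mu * L, C @ y)

def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    return W

# Toy data (hypothetical): two classes, nodes {0, 1, 2} and {3, 4, 5}.
informative = adjacency(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)])
noisy = adjacency(6, [(0, 3), (1, 4), (2, 5)])  # links only across classes

y = np.array([1.0, 0.0, 0.0, 0.0, 0.0, -1.0])  # node 0 is +1, node 5 is -1
labeled = [0, 5]

# All weight on the informative graph versus equal weight on both graphs.
f_clean = propagate([informative, noisy], [1.0, 0.0], y, labeled)
f_mixed = propagate([informative, noisy], [0.5, 0.5], y, labeled)
print(np.round(f_clean, 3))
print(np.round(f_mixed, 3))
```

In this toy case, putting all weight on the informative graph recovers a clean split between the two classes, while giving the cross-class graph equal weight pulls the scores of the unlabeled nodes toward zero. This is the kind of degradation that a graph weighting scheme is meant to prevent.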
Acknowledgments
This research is sponsored by NSF CAREER 1452425 and IIS 1408287. Any conclusions expressed in this material are those of the authors and do not necessarily reflect the views, expressed or implied, of the funding parties.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Ye, J., Akoglu, L. (2018). Robust Semi-Supervised Learning on Multiple Networks with Noise. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_16
DOI: https://doi.org/10.1007/978-3-319-93034-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93033-6
Online ISBN: 978-3-319-93034-3
eBook Packages: Computer Science (R0)