Abstract
Most existing network representation learning algorithms rely on network structure alone and neglect the rich external information associated with nodes (e.g., text content, communities, and label information). In addition, the learnt representations usually lack discriminative ability for tasks such as node classification and link prediction. To address these challenges, we present a novel semi-supervised algorithm, the text-associated max-margin DeepWalk algorithm (TMDW). TMDW incorporates text content and network structure into network representation learning through inductive matrix completion, and then uses node category labels to optimize the learnt representations based on the max-margin principle and a biased gradient. To integrate these tasks, we propose a novel and efficient network representation learning framework that is easy to extend and generates discriminative representations. We evaluate our model on multi-class classification tasks. The experimental results demonstrate that TMDW outperforms the baseline methods on three real-world datasets. The visualization results show that the representations learnt by TMDW are more discriminative than those of the unsupervised approaches.
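As a rough illustration of the inductive matrix completion step mentioned in the abstract, the sketch below fits a structure matrix M with the product W^T H T, where T holds per-node text features. It is a minimal sketch under stated assumptions, not the TMDW implementation: the toy graph, the choice M = (P + P^2)/2, the function name `imc_factorize`, and every hyperparameter are hypothetical, and the max-margin step with its biased gradient is omitted.

```python
import numpy as np

def imc_factorize(M, T, k=2, lam=0.1, lr=0.01, steps=2000, seed=0):
    """Fit M ~ W.T @ H @ T by plain gradient descent (illustrative only).

    M: (n, n) structure matrix, T: (f, n) text features,
    W: (k, n) and H: (k, f) are the learnt factors.
    Loss: 0.5 * ||W.T @ H @ T - M||_F^2 + 0.5 * lam * (||W||_F^2 + ||H||_F^2)
    """
    rng = np.random.default_rng(seed)
    n, f = M.shape[0], T.shape[0]
    W = 0.1 * rng.standard_normal((k, n))
    H = 0.1 * rng.standard_normal((k, f))
    losses = []
    for _ in range(steps):
        HT = H @ T                      # (k, n) text-based factor
        R = W.T @ HT - M                # (n, n) reconstruction residual
        reg = np.sum(W ** 2) + np.sum(H ** 2)
        losses.append(0.5 * np.sum(R ** 2) + 0.5 * lam * reg)
        grad_W = HT @ R.T + lam * W     # dLoss/dW, shape (k, n)
        grad_H = W @ R @ T.T + lam * H  # dLoss/dH, shape (k, f)
        W -= lr * grad_W
        H -= lr * grad_H
    return W, H, losses

# Hypothetical toy data: a 4-node path graph and 3 binary text features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)    # row-normalized transition matrix
M = (P + P @ P) / 2                     # assumed structure matrix
T = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)

W, H, losses = imc_factorize(M, T)
# One possible node representation: stack W and H @ T column-wise per node.
embeddings = np.vstack([W, H @ T]).T    # shape (n, 2k)
```

Here each row of `embeddings` is a 2k-dimensional vector for one node that mixes structural and text information. In a semi-supervised setting, a max-margin classifier trained on the labeled nodes could then be used to adjust these vectors.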
Acknowledgement
This project is supported by NSFC (No. 61663041, 61763041), the Program for Changjiang Scholars and Innovative Research Team in Universities (No. IRT_15R40), the Research Fund for the Chunhui Program of the Ministry of Education of China (No. Z2014022), the Nature Science Foundation of Qinghai Province (2014-ZJ-721), the Fundamental Research Funds for the Central Universities (2017TS045), and the Tibetan Information Processing and Machine Translation Key Laboratory (2013-Z-Y17).
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ye, Z., Zhao, H., Zhang, K., Zhu, Y., Xiao, Y. (2018). Text-Associated Max-Margin DeepWalk. In: Xu, Z., Gao, X., Miao, Q., Zhang, Y., Bu, J. (eds) Big Data. Big Data 2018. Communications in Computer and Information Science, vol 945. Springer, Singapore. https://doi.org/10.1007/978-981-13-2922-7_21
DOI: https://doi.org/10.1007/978-981-13-2922-7_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2921-0
Online ISBN: 978-981-13-2922-7
eBook Packages: Computer Science, Computer Science (R0)