Abstract
Link prediction is an important task in social network analysis: the goal is to predict missing links in a current network or new links in a future network. The key challenge in applying machine learning methods to link prediction is a lack of features. Most related studies address this problem by using features derived from network topology. In this work, we propose a novel feature extraction method based on Gradient Boosting Decision Trees (GBDT), which derives new attributes from the initial feature set. In the GBDT model, input features are transformed by means of boosted decision trees, and the output of each individual tree is treated as a categorical input feature to a sparse linear classifier. Extensive experiments demonstrate that the proposed method outperforms a number of mainstream baselines when GBDT features are included. The proposed method is an effective way to address the feature shortage problem in link prediction.
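The transformation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of real node-pair features, and picks the number of trees, tree depth, and the train split arbitrarily, since the abstract specifies none of these. Each boosted tree maps a sample to a leaf; the leaf index is treated as a categorical value, one-hot encoded, and passed to a linear classifier (logistic regression here, chosen as an assumption).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for node-pair features (e.g. topological similarity
# scores). Both the features and the link labels are made up for illustration.
rng = np.random.default_rng(0)
X = rng.random((400, 5))
y = (X[:, 0] + X[:, 1] * X[:, 2] > 0.8).astype(int)

# Use separate halves so the linear model is not fit on the same rows
# that were used to grow the trees.
X_gbdt, y_gbdt = X[:200], y[:200]
X_lin, y_lin = X[200:], y[200:]

# Hyperparameters are arbitrary assumptions, not values from the paper.
gbdt = GradientBoostingClassifier(n_estimators=20, max_depth=3, random_state=0)
gbdt.fit(X_gbdt, y_gbdt)


def leaf_indices(model, features):
    """Return one leaf index per tree for every sample (binary case)."""
    # apply() has shape (n_samples, n_estimators, 1) for binary problems.
    return model.apply(features)[:, :, 0]


# Treat each tree's leaf index as a categorical feature and one-hot encode it.
# Leaves not seen while fitting the encoder are encoded as all zeros.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(leaf_indices(gbdt, X_gbdt))
Z_gbdt = encoder.transform(leaf_indices(gbdt, X_gbdt))  # sparse matrix
Z_lin = encoder.transform(leaf_indices(gbdt, X_lin))    # sparse matrix

# Linear classifier trained on the sparse GBDT-derived features.
clf = LogisticRegression(max_iter=1000)
clf.fit(Z_lin, y_lin)
link_prob = clf.predict_proba(Z_lin)[:, 1]  # estimated probability of a link
```

Every encoded row has at most one active column per tree, so the representation stays sparse even when the number of trees is large. In practice the encoded leaf features could also be concatenated with the original features before fitting the linear model; whether the paper does so is not stated in the abstract.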
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, T., Wang, J., Tu, M., Zhang, Y., Yan, Y. (2016). Enhancing Link Prediction Using Gradient Boosting Features. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2016. Lecture Notes in Computer Science, vol 9772. Springer, Cham. https://doi.org/10.1007/978-3-319-42294-7_7
DOI: https://doi.org/10.1007/978-3-319-42294-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42293-0
Online ISBN: 978-3-319-42294-7
eBook Packages: Computer Science, Computer Science (R0)