Enhancing Link Prediction Using Gradient Boosting Features

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9772)

Abstract

Link prediction is an important task in social network analysis: the goal is to predict missing links in a current network or new links in a future network. A key challenge when machine learning methods are applied to link prediction is the lack of features. Most relevant studies address this problem by using features derived from network topology. In this work, we propose a novel feature extraction method based on the Gradient Boosting Decision Tree (GBDT), which effectively derives new attributes from the initial feature set. In the GBDT model, input features are transformed by means of boosted decision trees, and the output of each individual tree is treated as a categorical input feature to a sparse linear classifier. Extensive experiments demonstrate that the proposed method outperforms a number of mainstream baselines when GBDT features are considered. The proposed method offers an efficient way to address the feature shortage problem in link prediction.
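The transformation described in the abstract (each boosted tree maps a sample to a leaf, and the leaf index is used as a categorical feature for a sparse linear classifier) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, uses logistic regression as the linear classifier, and the helper names `leaf_features`, `fit_gbdt_lr` and `predict_link_proba`, the hyperparameters, and the choice to train the two stages on separate data splits are all assumptions made for the example. Each input row stands for the initial feature vector of one candidate node pair.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder


def leaf_features(gbdt, X):
    """Return, for each sample, the index of the leaf it reaches in every tree."""
    # For a binary problem, apply() has shape (n_samples, n_estimators, 1).
    return gbdt.apply(X)[:, :, 0]


def fit_gbdt_lr(X_gbdt, y_gbdt, X_lr, y_lr, n_trees=20, max_depth=3, seed=0):
    """Fit the GBDT on one split, then the encoder and linear model on another.

    Hypothetical helper: the split and the hyperparameters are assumptions.
    """
    gbdt = GradientBoostingClassifier(
        n_estimators=n_trees, max_depth=max_depth, random_state=seed
    )
    gbdt.fit(X_gbdt, y_gbdt)

    # One categorical feature per tree, one-hot encoded into a sparse matrix.
    # Leaves never seen while fitting the encoder are encoded as all zeros.
    encoder = OneHotEncoder(handle_unknown="ignore")
    Z_lr = encoder.fit_transform(leaf_features(gbdt, X_lr))

    linear = LogisticRegression(max_iter=1000)
    linear.fit(Z_lr, y_lr)
    return gbdt, encoder, linear


def predict_link_proba(gbdt, encoder, linear, X):
    """Estimated probability that each candidate node pair is linked."""
    Z = encoder.transform(leaf_features(gbdt, X))
    return linear.predict_proba(Z)[:, 1]
```

With `n_trees` trees, every encoded row contains at most `n_trees` non-zero entries (one per tree), which is why a linear model that handles sparse input is a natural fit for the second stage.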

Author information

Corresponding authors

Correspondence to Taisong Li, Jing Wang, Manshu Tu, Yan Zhang or Yonghong Yan.

Appendix

Table 5. Results for different supervised methods on the Aminer dataset

Fig. 3. Improvement with GBDT features on the Aminer dataset (Color figure online)

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, T., Wang, J., Tu, M., Zhang, Y., Yan, Y. (2016). Enhancing Link Prediction Using Gradient Boosting Features. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2016. Lecture Notes in Computer Science, vol 9772. Springer, Cham. https://doi.org/10.1007/978-3-319-42294-7_7

  • DOI: https://doi.org/10.1007/978-3-319-42294-7_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42293-0

  • Online ISBN: 978-3-319-42294-7

  • eBook Packages: Computer Science, Computer Science (R0)
