Evaluating and Extending Latent Methods for Link-Based Classification

  • Conference paper
Formalisms for Reuse and Systems Integration (FMI 2014)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 346)

Abstract

Data describing networks such as social networks, citation graphs, hypertext systems, and communication networks is becoming increasingly common and important for analysis. Research on link-based classification studies methods to leverage connections in such networks to improve accuracy. Recently, a number of such methods have been proposed that first construct a set of latent features or links that summarize the network, then use this information for inference. Some work has claimed that such latent methods improve accuracy, but has not compared against the best non-latent methods. In response, this article provides the first substantial comparison between these two groups. Using six real datasets, a range of synthetic data, and multiple underlying models, we show that (non-latent) collective inference methods usually perform best, but that the dataset’s label sparsity, attribute predictiveness, and link density can dramatically affect the performance trends. Inspired by these findings, we introduce three novel algorithms that combine a latent construction with a latent or non-latent method, and demonstrate that they can sometimes substantially increase accuracy.
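The two-stage pattern the abstract describes (build latent features that summarize the network, then use them for inference) can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a plain spectral embedding (top eigenvectors of the adjacency matrix) as a stand-in latent construction and a nearest-centroid rule as the classifier. The helper names `latent_features`, `nearest_centroid_predict`, and `two_clique_graph`, and the toy graph itself, are assumptions made for illustration only.

```python
import numpy as np


def latent_features(adjacency, k):
    """Return the k eigenvectors of a symmetric adjacency matrix
    that have the largest eigenvalues (one row of features per node)."""
    # eigh returns eigenvalues in ascending order, so the last k columns
    # correspond to the k largest eigenvalues.
    _, eigenvectors = np.linalg.eigh(adjacency)
    return eigenvectors[:, -k:]


def nearest_centroid_predict(features, labeled):
    """Assign every node the class whose centroid (mean latent feature
    vector of its labeled nodes) is closest in Euclidean distance.

    labeled: dict mapping node index -> known class label.
    """
    classes = sorted(set(labeled.values()))
    centroids = np.array([
        features[[n for n, c in labeled.items() if c == cls]].mean(axis=0)
        for cls in classes
    ])
    # distances[i, j] = distance from node i to the centroid of class j
    distances = np.linalg.norm(
        features[:, None, :] - centroids[None, :, :], axis=2
    )
    return [classes[j] for j in distances.argmin(axis=1)]


def two_clique_graph(size=4):
    """Hypothetical toy network: two cliques joined by a single edge."""
    n = 2 * size
    adjacency = np.zeros((n, n))
    adjacency[:size, :size] = 1
    adjacency[size:, size:] = 1
    np.fill_diagonal(adjacency, 0)
    adjacency[size - 1, size] = adjacency[size, size - 1] = 1
    return adjacency


adjacency = two_clique_graph()
features = latent_features(adjacency, k=2)
# Only one node per clique has a known label (a sparsely labeled network).
predictions = nearest_centroid_predict(features, {0: 0, 7: 1})
print(predictions)
```

On this toy graph each node receives the label of the single labeled node in its own clique, because the second-largest eigenvector takes opposite signs on the two cliques. Real datasets, and the latent and collective inference methods compared in the paper, are considerably more involved than this sketch.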

Author information

Correspondence to Luke K. McDowell.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

McDowell, L.K., Fleming, A., Markel, Z. (2015). Evaluating and Extending Latent Methods for Link-Based Classification. In: Bouabana-Tebibel, T., Rubin, S. (eds) Formalisms for Reuse and Systems Integration. FMI 2014. Advances in Intelligent Systems and Computing, vol 346. Springer, Cham. https://doi.org/10.1007/978-3-319-16577-6_10

  • DOI: https://doi.org/10.1007/978-3-319-16577-6_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16576-9

  • Online ISBN: 978-3-319-16577-6

  • eBook Packages: Engineering, Engineering (R0)
