Abstract
Data describing networks such as social networks, citation graphs, hypertext systems, and communication networks is becoming increasingly common and important for analysis. Research on link-based classification studies methods to leverage connections in such networks to improve accuracy. Recently, a number of such methods have been proposed that first construct a set of latent features or links that summarize the network, then use this information for inference. Some work has claimed that such latent methods improve accuracy, but has not compared against the best non-latent methods. In response, this article provides the first substantial comparison between these two groups. Using six real datasets, a range of synthetic data, and multiple underlying models, we show that (non-latent) collective inference methods usually perform best, but that the dataset’s label sparsity, attribute predictiveness, and link density can dramatically affect the performance trends. Inspired by these findings, we introduce three novel algorithms that combine a latent construction with a latent or non-latent method, and demonstrate that they can sometimes substantially increase accuracy.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
McDowell, L.K., Fleming, A., Markel, Z. (2015). Evaluating and Extending Latent Methods for Link-Based Classification. In: Bouabana-Tebibel, T., Rubin, S. (eds) Formalisms for Reuse and Systems Integration. FMI 2014. Advances in Intelligent Systems and Computing, vol 346. Springer, Cham. https://doi.org/10.1007/978-3-319-16577-6_10
DOI: https://doi.org/10.1007/978-3-319-16577-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16576-9
Online ISBN: 978-3-319-16577-6
eBook Packages: Engineering, Engineering (R0)