Link classification with probabilistic graphs

Di Mauro, Nicola; Taranto, Claudio; Esposito, Floriana

doi:10.1007/s10844-013-0293-0

Published: 15 January 2014

Volume 42, pages 181–206, (2014)

Journal of Intelligent Information Systems

Nicola Di Mauro¹,
Claudio Taranto¹ &
Floriana Esposito¹

Abstract

The need to deal with the inherent uncertainty in real-world relational or networked data has led to the proposal of new probabilistic models, such as probabilistic graphs. Every edge in a probabilistic graph is associated with a probability whose value represents the likelihood of its existence, or the strength of the relation between the entities it connects. The aim of this paper is to propose two machine learning techniques for the link classification problem in relational data that exploit the probabilistic graph representation. Both proposed methods exploit a language-constrained reachability method to infer the probability of possible hidden relationships that may exist between two nodes in a probabilistic graph. Each hidden relationship between two nodes may be viewed as a feature (or a factor), and its corresponding probability as its weight, while an observed relationship is considered a positive instance for its corresponding link label. Given a training set of observed links, the first learning approach uses a propositionalization technique that adopts an L2-regularized Logistic Regression to learn a model able to predict unobserved link labels. Since in some cases the edges’ probabilities may not be known in advance, or cannot be precisely defined for a classification task, the second proposed approach exploits the inference method and uses a mean squared technique to learn the edges’ probabilities. Both proposed methods have been evaluated on real-world data sets, and the corresponding results proved their validity.
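
The idea of turning a language-constrained connection into a weighted feature can be illustrated with a small sketch. The abstract does not specify the inference procedure, so the code below is only an assumption-laden illustration, not the authors' method: it treats the "language" as a fixed sequence of edge labels, assumes edges exist independently, and computes the probability that at least one matching path exists by exact inclusion-exclusion (feasible only for tiny graphs). The toy graph, the node and label names, and the functions `matching_paths` and `connection_probability` are all hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical toy probabilistic graph: (source, target, label) -> existence probability.
EDGES = {
    ("u1", "u2", "friend"): 0.6,
    ("u1", "u3", "friend"): 0.5,
    ("u2", "i2", "likes"): 0.5,
    ("u3", "i2", "likes"): 0.4,
    ("u2", "i1", "likes"): 0.8,
}


def matching_paths(edges, source, target, pattern):
    """Return every path (a tuple of edge keys) from source to target
    whose edge labels spell out `pattern` in order."""
    paths = []

    def extend(node, depth, used):
        if depth == len(pattern):
            if node == target:
                paths.append(tuple(used))
            return
        for edge in edges:
            src, dst, label = edge
            if src == node and label == pattern[depth]:
                extend(dst, depth + 1, used + [edge])

    extend(source, 0, [])
    return paths


def connection_probability(edges, paths):
    """Probability that at least one path has all of its edges present,
    assuming independent edges (inclusion-exclusion over sets of paths)."""
    total = 0.0
    for size in range(1, len(paths) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(paths, size):
            # Every edge used by any path in the subset must exist.
            needed = set().union(*subset)
            total += sign * prod(edges[e] for e in needed)
    return total


# One feature per label pattern; its value is the inferred connection probability.
pattern = ("friend", "likes")
paths = matching_paths(EDGES, "u1", "i2", pattern)
features = {pattern: connection_probability(EDGES, paths)}
```

Here the pair (u1, i2) is connected by two edge-disjoint paths matching the pattern, with probabilities 0.6 × 0.5 and 0.5 × 0.4, so the feature value is 0.3 + 0.2 − 0.06 = 0.44. A vector of such values, one per pattern, is the kind of propositional representation that could then be passed to a regularized logistic regression.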

Notes

Sometimes called certain graph.
In the rest of the paper, if not otherwise specified, \(\mathbb{I} {C}\) denotes the indicator function returning 1 if the condition C is true, and 0 otherwise.
http://www.csie.ntu.edu.tw/~cjlin/liblinear.
http://ir.ii.uam.es/hetrec2011/datasets.html
http://www.lastfm.com
http://www.di.uniba.it/~claudiotaranto/eagle.html

Acknowledgments

This work fulfills the research objectives of the PON02_00563_3489339 project “PUGLIA@SERVICE - L’Ingegneria dei Servizi Internet-Based per lo sviluppo strutturale di un territorio intelligente” funded by the Italian Ministry of University and Research (MIUR).

Author information

Authors and Affiliations

Department of Computer Science, University of Bari “Aldo Moro”, 70125, Bari, Italy
Nicola Di Mauro, Claudio Taranto & Floriana Esposito

Corresponding author

Correspondence to Nicola Di Mauro.

About this article

Cite this article

Di Mauro, N., Taranto, C. & Esposito, F. Link classification with probabilistic graphs. J Intell Inf Syst 42, 181–206 (2014). https://doi.org/10.1007/s10844-013-0293-0

Received: 12 May 2013
Revised: 27 September 2013
Accepted: 19 November 2013
Published: 15 January 2014
Issue Date: April 2014
DOI: https://doi.org/10.1007/s10844-013-0293-0
