ABSTRACT
Procedures for collective inference make simultaneous statistical judgments about the same variables for a set of related data instances. For example, collective inference could be used to simultaneously classify a set of hyperlinked documents or infer the legitimacy of a set of related financial transactions. Several recent studies indicate that collective inference can significantly reduce classification error when compared with traditional inference techniques. We investigate the underlying mechanisms for this error reduction by reviewing past work on collective inference and characterizing the different types of statistical models used for inference in relational data. We show important differences among these models, and we characterize the necessary and sufficient conditions for reduced classification error based on experiments with real and simulated data.
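To make the idea concrete, the kind of procedure the abstract describes can be sketched as a simple iterative relational-neighbor classifier in the spirit of Macskassy and Provost's relational classifier: unknown nodes repeatedly take the majority label of their labeled neighbors until the labeling stabilizes. The toy graph, seed labels, and synchronous update rule below are illustrative assumptions, not the paper's experimental setup.

```python
def collective_classify(edges, known, iterations=10):
    """Iteratively label unknown nodes by the majority label of their neighbors."""
    # Build an undirected adjacency list from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    labels = dict(known)
    unknown = [n for n in neighbors if n not in known]

    for _ in range(iterations):
        updated = {}
        for n in unknown:
            # Count labels among currently labeled neighbors.
            counts = {}
            for m in neighbors[n]:
                if m in labels:
                    counts[labels[m]] = counts.get(labels[m], 0) + 1
            if counts:
                updated[n] = max(counts, key=counts.get)
        # Apply all updates at once; stop when the labeling is stable.
        changed = any(labels.get(n) != lab for n, lab in updated.items())
        labels.update(updated)
        if not changed:
            break
    return labels

# Toy "hyperlink" graph: two triangles joined by a single bridge edge,
# with one seed label per cluster. Collective inference propagates the
# seeds so that each cluster ends up consistently labeled.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
known = {0: "A", 5: "B"}
result = collective_classify(edges, known)
```

Note the difference from traditional inference: each node's label depends on inferred labels of its neighbors, not only on the seed evidence, which is what makes the judgments simultaneous rather than independent.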
Index Terms
- Why collective inference improves relational classification