DOI: 10.1145/1014052.1014125
Article

Why collective inference improves relational classification

Published: 22 August 2004

ABSTRACT

Procedures for collective inference make simultaneous statistical judgments about the same variables for a set of related data instances. For example, collective inference could be used to simultaneously classify a set of hyperlinked documents or infer the legitimacy of a set of related financial transactions. Several recent studies indicate that collective inference can significantly reduce classification error when compared with traditional inference techniques. We investigate the underlying mechanisms for this error reduction by reviewing past work on collective inference and characterizing different types of statistical models used for making inference in relational data. We show important differences among these models, and we characterize the necessary and sufficient conditions for reduced classification error based on experiments with real and simulated data.
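To make the idea of simultaneous judgments over related instances concrete, the following is a minimal sketch of one simple collective scheme: each unlabeled node's class probability is repeatedly re-estimated as the average of its neighbors' current estimates. This neighbor-averaging rule, the function name `collective_classify`, and the toy graph are assumptions chosen for illustration; they are not the models or data evaluated in the paper.

```python
from typing import Dict, List, Tuple


def collective_classify(
    edges: List[Tuple[str, str]],
    known: Dict[str, float],
    n_iter: int = 50,
    prior: float = 0.5,
) -> Dict[str, float]:
    """Estimate P(label = 1) for every node by repeated neighbor averaging.

    Hypothetical illustration only: a simple relaxation scheme, not the
    paper's method.
    """
    # Build an undirected adjacency list from the edge pairs.
    neighbors: Dict[str, List[str]] = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    # Observed nodes keep their labels; unobserved nodes start at the prior.
    belief = {node: known.get(node, prior) for node in neighbors}

    for _ in range(n_iter):
        updated = dict(belief)
        for node, nbrs in neighbors.items():
            if node in known:
                continue  # observed labels stay fixed
            # Each unknown estimate depends on the neighbors' current
            # estimates, so the unknown nodes are inferred jointly rather
            # than one at a time in isolation.
            updated[node] = sum(belief[n] for n in nbrs) / len(nbrs)
        belief = updated
    return belief


# Toy chain a - b - c - d with only the two endpoints labeled.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
known = {"a": 1.0, "d": 0.0}
beliefs = collective_classify(edges, known)
labels = {node: int(p > 0.5) for node, p in beliefs.items()}
```

In this toy chain, neither `b` nor `c` has attributes of its own, yet each receives a label because information from the labeled endpoints propagates through the other unlabeled node. A non-collective classifier that looked only at labeled neighbors would see just one of the two endpoints from each interior node.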


    • Published in

      KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
      August 2004
      874 pages
      ISBN: 1581138881
      DOI: 10.1145/1014052

      Copyright © 2004 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
