Abstract
Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process, but in relational domains the inference process used for prediction introduces an additional source of error: collective inference techniques contribute error both through the use of approximate inference algorithms and through variation in the availability of test-set information. To date, the impact of inference error on model performance has not been investigated. We propose a new bias/variance framework that decomposes loss into errors due to both the learning and inference processes. We evaluate the performance of three relational models on both synthetic and real-world datasets and show that (1) inference can be a significant source of error, and (2) the models exhibit different types of errors as data characteristics are varied.
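To fix ideas, here is a minimal sketch of the conventional squared-loss bias/variance decomposition that the proposed framework extends. The notation (training set D, learned predictor ŷ_D, main prediction ȳ) is illustrative rather than the paper's own, and the split of the variance term at the end is a plausible reading of the abstract, not a quotation of the paper's definitions:

\[
\mathbb{E}_{D,t}\big[(t - \hat{y}_D(x))^2\big]
= \underbrace{\mathbb{E}_t\big[(t - y_*(x))^2\big]}_{\text{noise}}
+ \underbrace{\big(y_*(x) - \bar{y}(x)\big)^2}_{\text{bias}}
+ \underbrace{\mathbb{E}_D\big[(\hat{y}_D(x) - \bar{y}(x))^2\big]}_{\text{variance}},
\qquad \bar{y}(x) = \mathbb{E}_D\big[\hat{y}_D(x)\big],
\]

where t is the (noisy) target and y_*(x) the optimal prediction. In a collective-inference setting, the prediction depends not only on the training set D but also on the approximate inference procedure and on which test-set labels happen to be available, so the same averaging argument suggests attributing separate bias and variance components to learning and to inference, e.g.

\[
V(x) = V_L(x) + V_I(x) + V_{LI}(x),
\]

with learning, inference, and interaction terms obtained by averaging over the training sample and the inference process in turn.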
Additional information
Editors: Hendrik Blockeel, Jude Shavlik, Prasad Tadepalli.
About this article
Cite this article
Neville, J., Jensen, D. A bias/variance decomposition for models using collective inference. Mach Learn 73, 87–106 (2008). https://doi.org/10.1007/s10994-008-5066-6