Tree-Based Credal Networks for Classification

Zaffalon, Marco; Fagiuoli, Enrico

doi:10.1023/A:1025822321743

Published: December 2003

Volume 9, pages 487–509 (2003)

Reliable Computing

Abstract

Bayesian networks are models for uncertain reasoning that are also gaining importance for the data mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called the tree-augmented naive Bayes classifier, to credal sets originating from probability intervals. This extension is a basis for addressing the fundamental problem of prior ignorance about the distribution that generates the data, which is commonplace in data mining applications. This issue is often neglected, but addressing it properly is key to ultimately drawing reliable conclusions from the inferred models. In this paper we formalize the new model, develop an exact linear-time classification algorithm, and evaluate the credal net-based classifier on a number of real data sets. The empirical analysis shows that the new classifier is good and reliable, and raises a problem of excessive caution that is discussed in the paper. Overall, given the favorable trade-off between expressiveness and efficient computation, the newly proposed classifier appears to be a good candidate for the wide-scale application of reliable classifiers based on credal networks to real and complex tasks.
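To make the idea of classifying with probability intervals more concrete, the following minimal sketch shows how interval-valued class probabilities can lead to a set of candidate classes rather than a single one. It is an illustration only and not the paper's exact linear-time algorithm: the function name `non_dominated`, the interval-dominance rule it applies, and the numbers are all assumptions introduced here. A class is discarded when some other class has a lower bound above its upper bound; when the intervals overlap, several classes survive, which is the kind of cautious output the abstract alludes to.

```python
from typing import Dict, List, Tuple

# (lower, upper) bounds on the probability of a class; hypothetical type alias.
Interval = Tuple[float, float]


def non_dominated(intervals: Dict[str, Interval]) -> List[str]:
    """Return the classes that no other class interval-dominates.

    Illustrative assumption: class o dominates class c when the lower
    bound of o is strictly greater than the upper bound of c.
    """
    keep = []
    for c, (_, upper_c) in intervals.items():
        dominated = any(
            lower_o > upper_c
            for o, (lower_o, _) in intervals.items()
            if o != c
        )
        if not dominated:
            keep.append(c)
    return keep


# Made-up posterior intervals for three classes (not data from the paper).
precise_case = {"a": (0.70, 0.80), "b": (0.10, 0.20), "c": (0.05, 0.15)}
vague_case = {"a": (0.30, 0.60), "b": (0.25, 0.55), "c": (0.05, 0.15)}

print(non_dominated(precise_case))  # narrow intervals: a single class remains
print(non_dominated(vague_case))    # overlapping intervals: a set of classes
```

With narrow intervals the sketch behaves like an ordinary classifier; with wide, overlapping intervals it returns more than one class instead of committing to a possibly unreliable single answer.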

Author information

Authors and Affiliations

IDSIA, Galleria 2, CH-6928, Manno (Lugano), Switzerland
Marco Zaffalon
DiSCo, Università degli Studi di Milano-Bicocca, Via Bicocca degli Arcimboldi 8, I-20126, Milano, Italy
Enrico Fagiuoli

About this article

Cite this article

Zaffalon, M., Fagiuoli, E. Tree-Based Credal Networks for Classification. Reliable Computing 9, 487–509 (2003). https://doi.org/10.1023/A:1025822321743

Issue Date: December 2003
DOI: https://doi.org/10.1023/A:1025822321743
