Abstract
In this paper, we study the use of Bayesian networks to interpret breast X-ray images in the context of breast-cancer screening. In particular, we investigate the performance of a manually developed Bayesian network under various discretisation schemes to check whether the probabilistic parameters in the initial manual network with continuous features are optimal and correctly reflect the reality. The classification performance was determined using ROC analysis. A few algorithms perform better than the continuous baseline: best was the entropy-based method of Fayyad and Irani, but also simpler algorithms did outperform the continuous baseline. Two simpler methods with only 3 bins per variable gave results similar to the continuous baseline. These results indicate that it is worthwhile to consider discretising continuous data when developing Bayesian networks and support the practical importance of probabilitistic parameters in determining the network’s performance.
Keywords
- Receiver Operating Characteristic Curve
- Bayesian Network
- Bayesian Network Model
- Discretisation Technique
- Link Level
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Abraham, R., Simha, J.B., Iyengar, S.S.: A comparative analysis of discretization methods for medical datamining with naїve Bayesian classifier. In: Proc. of the Ninth International Conference on Information Technology, pp. 235–236 (2006)
Acid, S., de Campos, L.M., Fernandez-Luna, J.M., Rodriguez, S., Rodriguez, J.M., Salcedo, J.L.: A comparison of learning algorithms for Bayesian networks: a case study based on data from an emergency medical service. Artif. Intel. in Medicine 30(3), 215–232 (2004)
Bradley, A.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30(7), 1145–1159 (1997)
Burnside, E., Davis, J., Chhatwal, J., Alagoz, O., Lindstrom, M., Geller, B., Littenberg, B., Shaffer, K., Kahn Jr, C., Page, C.: Probabilistic computer model developed from clinical data in national mammography database format to classify mammographic findings. Radiology 251(3), 663–672 (2009)
Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39(1), 1–38 (1977)
D’Orsi, C., Bassett, L., Berg, W.e.a.: Breast Imaging Reporting and Data System: ACR BIRADS- Mammography (ed 4) (2003)
Dougherty, J., Kohavi, R., Sahami, M.: Supervised and unsupervised discretization of continuous features. In: Proc. of the 12th ICML, pp. 194–202 (1995)
Druzdzel, M.J., Onisko, A.: Are Bayesian networks sensitive to precision of their parameters? In: Proc. of the International IIS08 Conference, Intelligent Information Systems XVI, pp. 35–44 (2008)
Fayyad, U., Irani, K.: Multi-interval discretization of continuous-valued attributes for classification learning. In: Proc. of the 13th IJCAI, pp. 1022–1027 (1993)
Ferreira, N., Velikova, M., Lucas, P.: Bayesian modelling of multi-view mammography. In: Proc. of the ICML Workshop on Machine Learning for Health-Care Applications (2008)
Flores, J.L., Inza, I., naga, P.L.: Wrapper discretization by means of estimation of distribution algorithms. Intelligent Data Analysis 11(5), 525–545 (2007)
Geurts, P.,Wehenkel, L.: Investigation and reduction of discretization variance in decision tree induction. Lecture Notes In Computer Science 1810, 162–170 (2000)
Ismail, M.K., Ciesielski, V.: An empirical investigation of the impact of discretization on common data distributions. In: Proc. of the Third Int. Conf. on Hybrid Intelligent Systems: Design and Application of Hybrid Intelligent Systems, pp. 692–701 (2003)
Jensen, F., Nielsen, T.: Bayesian networks and decision graphs. Springer Verlag (2007)
Kahn, C., Roberts, L., Shaffer, K., Haddawy, P.: Construction of a Bayesian network for mammographic diagnosis of breast cancer. Comp. in Biol. and Medic. 27(1), 19–29 (1997)
Mizianty, M., Kurgan, L., Ogiela, M.: Comparative analysis of the impact of discretization on the classification with na¨ıve Bayes and semi-na¨ıve Bayes classifiers. In: Proc. of the Seventh International Conference on Machine Learning and Applications, pp. 823–828 (2008)
Murphy, K.: Bayesian network toolbox (BNT) (2007). http://people.cs.ubc.ca/_murphyk/ Software/BNT/bnt.html
Pradhan, A., Henrion, M., Provan, G., del Favero, B., Huang, K.: The sensitivity of belief networks to imprecise probabilities: an experimental investigation. Artificial Intelligence 84(1-2),357–357 (1996)
Radstake, N., Lucas, P.J.F., Velikova, M., Samulski, M.: Critiquing knowledge representation in medical image interpretation using structure learning. In: Proc. of the Second Workshop ”Knowledge Representation for Health Care”, Lisbon, Portugal (2010)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, Second Edition. Morgan Kaufmann, San Francisco, CA, USA (2005)
Yang, Y., Webb, G.: Proportional k-interval discretization for na¨ıve-Bayes classifiers. In: Machine Learning: ECML 2001, pp. 564–575. Springer (2001)
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag London Limited
About this paper
Cite this paper
Robben, S., Velikova, M., Lucas, P.J., Samulski, M. (2011). Discretisation Does Affect the Performance of Bayesian Networks. In: Bramer, M., Petridis, M., Hopgood, A. (eds) Research and Development in Intelligent Systems XXVII. SGAI 2010. Springer, London. https://doi.org/10.1007/978-0-85729-130-1_17
Download citation
DOI: https://doi.org/10.1007/978-0-85729-130-1_17
Published:
Publisher Name: Springer, London
Print ISBN: 978-0-85729-129-5
Online ISBN: 978-0-85729-130-1
eBook Packages: Computer ScienceComputer Science (R0)