Abstract
In this paper, we propose a method that removes exceptional patients in a spiral manner to obtain a definition of a disease in the form of a table of conditional probabilities, and we describe its application to chronic hepatitis data. Removal follows a risk-ratio-based criterion and can be supported by our previously developed data mining methods and by medical experts. A series of experiments in which two domain experts decided which exceptional patients to remove shows that the proposed method is effective and promising from several viewpoints, such as obtaining new hypotheses and improving the skills of domain experts. Another series of experiments, in which exceptional patients were removed automatically, led us to rediscover a piece of knowledge that had been reported as the main result of an article in a medical journal.
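As a rough illustration of the idea summarized above, the following Python sketch computes a risk ratio for one binary attribute, repeatedly removes patients flagged as exceptional, and then builds a table of conditional probabilities from the remaining patients. The abstract does not specify the actual criterion, so everything here is an assumption made for illustration: the record fields `id` and `disease`, the function names, the `threshold` stopping rule, and the `choose_exceptions` callback, which stands in for either a domain expert or an automatic procedure.

```python
from collections import Counter


def risk_ratio(patients, attribute, exposed_value, label="disease"):
    """Risk of the disease among patients with `exposed_value`, divided by
    the risk among all other patients. Returns None when undefined."""
    exposed = [p for p in patients if p[attribute] == exposed_value]
    unexposed = [p for p in patients if p[attribute] != exposed_value]
    if not exposed or not unexposed:
        return None
    risk_exposed = sum(bool(p[label]) for p in exposed) / len(exposed)
    risk_unexposed = sum(bool(p[label]) for p in unexposed) / len(unexposed)
    if risk_unexposed == 0:
        return None
    return risk_exposed / risk_unexposed


def conditional_probability_table(patients, attribute, label="disease"):
    """Estimate P(label value | attribute value) by relative frequency."""
    joint = Counter((p[attribute], p[label]) for p in patients)
    marginal = Counter(p[attribute] for p in patients)
    return {(a, d): n / marginal[a] for (a, d), n in joint.items()}


def spiral_removal(patients, attribute, exposed_value, choose_exceptions,
                   threshold=2.0, max_rounds=10):
    """Hypothetical loop: remove flagged patients round by round until the
    risk ratio reaches `threshold` (an assumed stopping rule) or nothing
    more is flagged. `choose_exceptions` returns a set of patient ids."""
    current = list(patients)
    for _ in range(max_rounds):
        rr = risk_ratio(current, attribute, exposed_value)
        if rr is not None and rr >= threshold:
            break
        exceptions = choose_exceptions(current)
        if not exceptions:
            break
        current = [p for p in current if p["id"] not in exceptions]
    return current, conditional_probability_table(current, attribute)
```

Passing the selection step in as a callback keeps the loop the same whether exceptional patients are chosen by experts or automatically, which mirrors the two series of experiments described in the abstract.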
Cite this article
Jumi, M., Ohshima, M., Zhong, N. et al. Spiral Removal of Exceptional Patients for Mining Chronic Hepatitis Data. New Gener. Comput. 25, 223–234 (2007). https://doi.org/10.1007/s00354-007-0014-8
DOI: https://doi.org/10.1007/s00354-007-0014-8