Abstract
A machine learning technique, the decision tree, is used to predict susceptibility to two liver diseases, chronic hepatitis and cirrhosis, from single nucleotide polymorphism (SNP) data, and to identify a set of SNPs relevant to those diseases. In the experiments, a decision tree distinguishes chronic hepatitis from normal with an accuracy of 69.59% and cirrhosis from normal with an accuracy of 76.72%; the C4.5 decision rules achieve an accuracy of 69.59% for chronic hepatitis and 79.31% for cirrhosis. These results suggest that the decision tree is a potential tool for predicting susceptibility to chronic hepatitis and cirrhosis from SNP data.
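To make the SNP selection step concrete, the following is a minimal sketch of how a C4.5-style learner could rank SNPs when choosing the test at a tree node, using the gain ratio criterion. It is not the authors' implementation. The SNP names (`snp_a`, `snp_b`), the genotype values, and the six samples are hypothetical toy data invented for illustration, and the sketch omits parts of full C4.5 such as pruning, missing-value handling, and the average-gain filter.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(genotypes, labels):
    """Gain ratio obtained by splitting the samples on one SNP's genotype."""
    total = len(labels)
    # Group the class labels by genotype (e.g. "AA", "AG", "GG").
    groups = {}
    for genotype, label in zip(genotypes, labels):
        groups.setdefault(genotype, []).append(label)
    # Expected entropy of the labels after the split.
    remainder = sum(len(ys) / total * entropy(ys) for ys in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises SNPs that fragment the data into many groups.
    split_info = entropy(genotypes)
    return info_gain / split_info if split_info > 0 else 0.0


def best_snp(samples, labels):
    """Return the name of the SNP with the highest gain ratio."""
    names = samples[0].keys()
    return max(names, key=lambda name: gain_ratio([s[name] for s in samples], labels))


# Hypothetical toy data: one dict of SNP genotypes per subject.
samples = [
    {"snp_a": "AA", "snp_b": "CT"},
    {"snp_a": "AA", "snp_b": "CC"},
    {"snp_a": "AG", "snp_b": "CT"},
    {"snp_a": "GG", "snp_b": "CC"},
    {"snp_a": "GG", "snp_b": "CT"},
    {"snp_a": "AG", "snp_b": "CC"},
]
labels = ["normal", "normal", "normal", "cirrhosis", "cirrhosis", "cirrhosis"]

print(best_snp(samples, labels))
```

In this toy set, `snp_a` separates the two classes much better than `snp_b`, so it would be chosen as the root test. Applying the same ranking recursively to each genotype subset grows the tree, and the SNPs that end up in the tree form the candidate set of disease-relevant SNPs.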
This work was supported by a grant from the Korea Health 21 R&D Project, Ministry of Health and Welfare, Republic of Korea (A010383). Also, it was supported by Hallym University Research Fund, HRF-2004-40.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, DH., Uhmn, S., Ko, YW., Cho, S.W., Cheong, J.Y., Kim, J. (2007). Chronic Hepatitis and Cirrhosis Classification Using SNP Data, Decision Tree and Decision Rule. In: Gervasi, O., Gavrilova, M.L. (eds) Computational Science and Its Applications – ICCSA 2007. ICCSA 2007. Lecture Notes in Computer Science, vol 4707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74484-9_51
DOI: https://doi.org/10.1007/978-3-540-74484-9_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74482-5
Online ISBN: 978-3-540-74484-9
eBook Packages: Computer Science, Computer Science (R0)