Leukemia Prediction from Gene Expression Data—A Rough Set Approach

Fang, Jianwen; Grzymala-Busse, Jerzy W.

doi:10.1007/11785231_94

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4029)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

We present our results on the prediction of leukemia from microarray gene expression data. Our methodology was based on data mining, specifically rule induction using rough set theory, and relied on a novel scheme of rule generations and cumulative rule sets. The final rule set contained only eight rules, using combinations of eight genes. All cases from the training data set and all but one case from the testing data set were correctly classified. Moreover, six of the eight genes we found are well known in the literature as relevant to leukemia.
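For readers unfamiliar with rough set theory, the following minimal sketch illustrates its two basic constructs, the lower and upper approximations of a concept, on which rough-set rule induction is built. It is not the authors' rule induction procedure; the gene names, expression levels, and class labels are hypothetical toy values invented for illustration, and the function names are assumptions.

```python
from collections import defaultdict


def indiscernibility_classes(cases, attributes):
    """Group case ids whose values agree on every listed attribute."""
    blocks = defaultdict(set)
    for case_id, values in cases.items():
        key = tuple(values[a] for a in attributes)
        blocks[key].add(case_id)
    return list(blocks.values())


def approximations(cases, attributes, concept):
    """Return (lower, upper) approximations of a set of case ids.

    lower: cases that certainly belong to the concept
    upper: cases that possibly belong to the concept
    """
    lower, upper = set(), set()
    for block in indiscernibility_classes(cases, attributes):
        if block <= concept:   # block lies entirely inside the concept
            lower |= block
        if block & concept:    # block overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical, already-discretized expression levels (not from the paper).
cases = {
    1: {"gene_a": "high", "gene_b": "low"},
    2: {"gene_a": "high", "gene_b": "low"},
    3: {"gene_a": "low", "gene_b": "low"},
    4: {"gene_a": "low", "gene_b": "high"},
}
# Hypothetical concept: the cases carrying one class label.
concept = {1, 3}

lower, upper = approximations(cases, ["gene_a", "gene_b"], concept)
print("lower:", sorted(lower))
print("upper:", sorted(upper))
```

Cases 1 and 2 have identical attribute values but different labels, so neither can be placed in the concept with certainty. Rules induced from the lower approximation are certain, while rules induced from the upper approximation are only possible.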

Author information

Authors and Affiliations

Bioinformatics Core Facility, and Information and Telecommunication Technology Center, University of Kansas, Lawrence, KS, 66045, USA
Jianwen Fang
Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS, 66045, USA
Jerzy W. Grzymala-Busse
Institute of Computer Science Polish Academy of Sciences, 01-237, Warsaw, Poland
Jerzy W. Grzymala-Busse

Editor information

Editors and Affiliations

Department of Artificial Intelligence, Academy of Humanities and Economics, Poland
Leszek Rutkowski
Institute of Automatics, AGH University of Science and Technology, Al. Mickiewicza 30, PL-30-059, Kraków, Poland
Ryszard Tadeusiewicz
Department of Electrical Engineering and Computer Sciences, Berkeley Initiative in Soft Computing (BISC), University of California, 94720-1776, Berkeley, CA, USA
Lotfi A. Zadeh
Department of Electrical Engineering, University of Louisville, 40292, Louisville, KY, U.S.A
Jacek M. Żurada

Cite this paper

Fang, J., Grzymala-Busse, J.W. (2006). Leukemia Prediction from Gene Expression Data—A Rough Set Approach. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Żurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2006. Lecture Notes in Computer Science, vol 4029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11785231_94

DOI: https://doi.org/10.1007/11785231_94
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35748-3
Online ISBN: 978-3-540-35750-6
eBook Packages: Computer Science (R0)
