Leukemia Prediction from Gene Expression Data—A Rough Set Approach

  • Conference paper
Artificial Intelligence and Soft Computing – ICAISC 2006 (ICAISC 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4029)

Abstract

We present our results on predicting leukemia from microarray data. Our approach is based on data mining, specifically rule induction using rough set theory, and introduces a novel methodology built on rule generations and cumulative rule sets. The final rule set contained only eight rules, which used combinations of eight genes. All cases from the training data set and all but one case from the testing data set were correctly classified. Moreover, six of the eight genes we identified are well known in the literature as relevant to leukemia.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fang, J., Grzymala-Busse, J.W. (2006). Leukemia Prediction from Gene Expression Data—A Rough Set Approach. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Żurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2006. ICAISC 2006. Lecture Notes in Computer Science, vol 4029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11785231_94

  • DOI: https://doi.org/10.1007/11785231_94

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35748-3

  • Online ISBN: 978-3-540-35750-6

  • eBook Packages: Computer Science, Computer Science (R0)
