Abstract
Identifying informative gene subsets that discern between the available samples of gene expression data is an important task in bioinformatics. A reduct, from rough set theory, corresponds to a minimal set of essential genes for discerning samples and is an efficient tool for gene selection. Because of the computational complexity of existing reduct algorithms, feature ranking is usually applied first to narrow down the gene space, and the top-ranked genes are selected. In this paper, we define a novel criterion for scoring genes, based on the difference in expression level between classes and on each gene's contribution to classification, and we present an algorithm for generating all possible reducts from the informative genes. The algorithm takes the whole attribute set into account and finds short reducts with a significant reduction in computational complexity. Experiments on benchmark gene expression data sets demonstrate that the approach succeeds in selecting highly discriminative genes and achieves impressive classification accuracy.
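To make the notion of a reduct concrete, the following is a minimal sketch, not the algorithm presented in the chapter. It assumes expression values have already been discretized and that the decision table is consistent, meaning that samples from different classes differ on at least one gene. It enumerates gene subsets by brute force, which is exponential in the number of genes and is only practical for a handful of attributes. The function names `discerns` and `all_reducts` and the toy data are hypothetical.

```python
from itertools import combinations


def discerns(table, labels, attrs):
    """Return True if every pair of samples with different class labels
    differs on at least one attribute (gene) index in attrs."""
    n = len(table)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j] and all(
                table[i][a] == table[j][a] for a in attrs
            ):
                return False
    return True


def all_reducts(table, labels):
    """Brute-force enumeration of decision-relative reducts.

    Subsets are visited in order of increasing size, so any subset that
    contains a previously found reduct cannot be minimal and is skipped.
    Assumes a consistent table; otherwise no subset qualifies.
    """
    n_attrs = len(table[0])
    reducts = []
    for size in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), size):
            if any(set(r) <= set(attrs) for r in reducts):
                continue  # superset of a reduct: not minimal
            if discerns(table, labels, attrs):
                reducts.append(attrs)
    return reducts


# Toy data (hypothetical): 4 samples, 3 discretized genes.
table = [
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 1),
    (1, 0, 0),
]
labels = ["tumor", "tumor", "normal", "normal"]

print(all_reducts(table, labels))
```

In this toy table, gene 0 alone separates the two classes, and genes 1 and 2 do so only together, so both subsets are reducts. The cost of this exhaustive search is the reason a ranking step is commonly used to shrink the gene space before reducts are computed.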
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Sun, L., Miao, D., Zhang, H. (2010). Gene Selection and Cancer Classification: A Rough Sets Based Approach. In: Peters, J.F., Skowron, A., Słowiński, R., Lingras, P., Miao, D., Tsumoto, S. (eds) Transactions on Rough Sets XII. Lecture Notes in Computer Science, vol 6190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14467-7_6
DOI: https://doi.org/10.1007/978-3-642-14467-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14466-0
Online ISBN: 978-3-642-14467-7
eBook Packages: Computer Science, Computer Science (R0)