Abstract
DNA microarrays have contributed to the exponential growth of genetic data over the years. One possible application of this large amount of gene expression data is the diagnosis of diseases such as cancer using classification methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge into the classification process using Rough Set Theory, making it more effective. In addition, the proposed model is able to indicate which part of the biological knowledge has been used in building the model and classifying new samples.
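The abstract does not spell out the model itself, so the following is only a minimal sketch of the standard rough set construction it builds on (Pawlak's lower and upper approximations), not the authors' method. The function name `approximations`, the toy samples, and the gene names `geneA` and `geneB` are hypothetical and chosen purely for illustration; expression values are assumed to be already discretized.

```python
from collections import defaultdict


def approximations(samples, attributes, target):
    """Return (lower, upper) approximations of the samples labelled `target`.

    `samples` maps a sample name to (attribute_values, label).
    Samples with identical values on `attributes` are indiscernible.
    """
    # Group samples into indiscernibility classes over the chosen attributes.
    classes = defaultdict(set)
    for name, (values, _label) in samples.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)

    # The concept to approximate: all samples carrying the target label.
    concept = {name for name, (_values, label) in samples.items() if label == target}

    lower, upper = set(), set()
    for block in classes.values():
        if block <= concept:   # every member certainly belongs to the concept
            lower |= block
        if block & concept:    # at least one member belongs to the concept
            upper |= block
    return lower, upper


# Hypothetical toy data: discretized expression levels for two made-up genes.
samples = {
    "s1": ({"geneA": "high", "geneB": "low"}, "tumor"),
    "s2": ({"geneA": "high", "geneB": "low"}, "normal"),
    "s3": ({"geneA": "low", "geneB": "low"}, "normal"),
    "s4": ({"geneA": "high", "geneB": "high"}, "tumor"),
}

lower, upper = approximations(samples, ["geneA", "geneB"], "tumor")
print(sorted(lower))  # samples that are certainly "tumor"
print(sorted(upper))  # samples that are possibly "tumor"
```

In this toy data, `s1` and `s2` have identical expression profiles but different labels, so they fall in the boundary region: they appear in the upper approximation but not in the lower one. Restricting `attributes` to a subset of genes (for example, one suggested by prior knowledge of gene function) changes the indiscernibility classes and therefore the approximations.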
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Calvo-Dmgz, D., Galvez, J.F., Glez-Peña, D., Fdez-Riverola, F. (2012). Biological Knowledge Integration in DNA Microarray Gene Expression Classification Based on Rough Set Theory. In: Rocha, M., Luscombe, N., Fdez-Riverola, F., Rodríguez, J. (eds) 6th International Conference on Practical Applications of Computational Biology & Bioinformatics. Advances in Intelligent and Soft Computing, vol 154. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28839-5_6
DOI: https://doi.org/10.1007/978-3-642-28839-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28838-8
Online ISBN: 978-3-642-28839-5
eBook Packages: Engineering, Engineering (R0)