Abstract
DNA microarray data are used to identify genes that can serve as prognostic markers. However, because each study has a limited sample size, the resulting signatures are unstable in their gene composition and may be limited in performance. It is therefore of great interest to integrate different studies and thereby increase the sample size. Several earlier studies explored microarray data merging, but the appearance of new techniques and a focus on SVM-based classification called for further investigation. We applied distant metastasis prediction, based on SVM attribute selection and classification, to three breast cancer data sets. The results showed that breast cancer classification does not benefit from data merging, confirming the results of other studies that used different techniques.
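As a rough illustration of what "data merging" can mean in practice, the sketch below standardizes each gene within its own study and then concatenates the studies over their shared genes. This is only one simple merging strategy, chosen here for brevity; the abstract does not state which merging methods the authors compared. The gene names, the expression values, and the helper functions `standardize_per_gene` and `merge_studies` are all hypothetical.

```python
from statistics import mean, pstdev


def standardize_per_gene(study):
    """Z-score each gene within one study.

    `study` maps a gene name to a list of expression values (one per sample).
    """
    scaled = {}
    for gene, values in study.items():
        mu = mean(values)
        sd = pstdev(values)
        # A constant gene has zero spread; map it to zeros rather than divide by 0.
        scaled[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return scaled


def merge_studies(studies):
    """Concatenate samples across studies, keeping only genes present in all."""
    shared = set.intersection(*(set(s) for s in studies))
    scaled = [standardize_per_gene(s) for s in studies]
    return {
        gene: [v for s in scaled for v in s[gene]]
        for gene in sorted(shared)
    }


# Made-up toy values: two studies measured on very different scales.
study_a = {
    "GENE1": [1.0, 2.0, 3.0],
    "GENE2": [5.0, 5.0, 8.0],
    "GENE3": [0.1, 0.2, 0.3],  # absent from study_b, so dropped on merge
}
study_b = {
    "GENE1": [100.0, 300.0],
    "GENE2": [40.0, 20.0],
}

merged = merge_studies([study_a, study_b])
```

After merging, each gene has one value per sample across both studies (five here), and the values from each study are centered on zero, so a downstream classifier sees the studies on a comparable scale. Whether such an adjustment actually helps is the question the paper investigates.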
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bevilacqua, V., Pannarale, P., Abbrescia, M., Cava, C., Tommasi, S. (2012). Comparison of Data-Merging Methods with SVM Attribute Selection and Classification in Breast Cancer Gene Expression. In: Huang, DS., Gan, Y., Premaratne, P., Han, K. (eds) Bio-Inspired Computing and Applications. ICIC 2011. Lecture Notes in Computer Science, vol 6840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24553-4_66
DOI: https://doi.org/10.1007/978-3-642-24553-4_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24552-7
Online ISBN: 978-3-642-24553-4
eBook Packages: Computer Science, Computer Science (R0)