Ensemble of Support Vector Machines to Improve the Cancer Class Prediction Based on the Gene Expression Profiles

Blanco, Ángela; Martín-Merino, Manuel; Rivas, Javier De Las

doi:10.1007/978-3-540-74972-1_51

Part of the book series: Advances in Soft Computing (AINSC, volume 44)

Abstract

DNA microarrays provide rich profiles that are used in cancer prediction, considering the gene expression levels across a collection of samples. Support Vector Machines (SVMs) have been applied to the classification of cancer samples with encouraging results. However, they are usually based on Euclidean distances, which fail to reflect the sample proximities accurately. Moreover, SVM classifiers based on non-Euclidean dissimilarities fail to reduce the errors significantly. In this paper, we propose an ensemble of SVM classifiers in order to reduce the errors. The diversity among classifiers is induced by considering a set of complementary dissimilarities and kernels. The experimental results suggest that our algorithm improves on classifiers based on a single dissimilarity and on a combination strategy such as Bagging.
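
The idea of inducing diversity through several dissimilarities can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state the base-classifier details or the combination rule, so the sketch assumes a plain majority vote and swaps in a 1-nearest-neighbour classifier for the SVM to stay dependency-free. The function names, the choice of three dissimilarities, and the toy expression profiles are all hypothetical.

```python
import math
from collections import Counter

# Three dissimilarities between expression profiles (assumed choices,
# not necessarily the set used in the chapter).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def correlation_dissimilarity(a, b):
    # 1 - Pearson correlation; returns 1.0 if either profile is constant.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ca = [x - mean_a for x in a]
    cb = [y - mean_b for y in b]
    norm = math.sqrt(sum(x * x for x in ca)) * math.sqrt(sum(y * y for y in cb))
    if norm == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(ca, cb)) / norm

# An odd number of voters avoids ties in a two-class problem.
DISSIMILARITIES = [euclidean, manhattan, correlation_dissimilarity]

def nearest_neighbour_label(dissim, train_x, train_y, sample):
    # Stand-in base classifier: label of the closest training profile.
    best = min(range(len(train_x)), key=lambda i: dissim(train_x[i], sample))
    return train_y[best]

def ensemble_predict(train_x, train_y, sample, dissimilarities=DISSIMILARITIES):
    # One base classifier per dissimilarity, combined by majority vote
    # (the voting rule is an assumption of this sketch).
    votes = [nearest_neighbour_label(d, train_x, train_y, sample)
             for d in dissimilarities]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: four samples, three genes each.
train_x = [[1.0, 2.0, 3.0], [1.2, 1.9, 3.1], [3.0, 2.0, 1.0], [3.1, 2.2, 0.9]]
train_y = ["tumor", "tumor", "normal", "normal"]

print(ensemble_predict(train_x, train_y, [1.1, 2.1, 2.9]))  # tumor
print(ensemble_predict(train_x, train_y, [2.9, 2.1, 1.1]))  # normal
```

In a fuller version, each base learner would be an SVM trained on a kernel derived from its dissimilarity; the voting step would stay the same under the majority-vote assumption made here.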

Author information

Authors and Affiliations

Universidad Pontificia de Salamanca (UPSA), C/ Compañía 5, 37002, Salamanca, Spain
Ángela Blanco & Manuel Martín-Merino
Cancer Research Center (CIC-IBMCC, CSIC/USAL), Salamanca, Spain
Javier De Las Rivas

Editor information

Editors and Affiliations

Escuela Politécnica Superior, Campus Vena, Edificio C, Universidad de Burgos, C/Francisco de Vitoria s/n, 09006, Burgos, Spain
Emilio Corchado
Departamento de Informática y Automática, Facultad de Ciencias, Universidad de Salamanca, Plaza de la Merced s/n, 37008, Salamanca, Spain
Juan M. Corchado
Centre for Quantifiable Quality of Service in Communication Systems (Q2S) Centre of Excellence, Norwegian University of Science and Technology, O.S. Bragstads plass 2E, 7491, Trondheim, Norway
Ajith Abraham

About this chapter

Cite this chapter

Blanco, Á., Martín-Merino, M., Rivas, J.D.L. (2007). Ensemble of Support Vector Machines to Improve the Cancer Class Prediction Based on the Gene Expression Profiles. In: Corchado, E., Corchado, J.M., Abraham, A. (eds) Innovations in Hybrid Intelligent Systems. Advances in Soft Computing, vol 44. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74972-1_51

DOI: https://doi.org/10.1007/978-3-540-74972-1_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74971-4
Online ISBN: 978-3-540-74972-1
eBook Packages: Engineering, Engineering (R0)
