Boosting Support Vector Machines Using Multiple Dissimilarities

Blanco, Ángela; Martín-Merino, Manuel

doi:10.1007/978-3-540-74819-9_18


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4692)

Included in the following conference series:

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems

Abstract

Support Vector Machines (SVMs) are powerful machine learning techniques that can handle high-dimensional and noisy data. They have been successfully applied to a wide range of problems, particularly to the analysis of gene expression data.

However, SVM algorithms usually rely on the Euclidean distance, which often fails to reflect the proximities between objects. Several versions of the SVM have been proposed that incorporate non-Euclidean dissimilarities. Nevertheless, different dissimilarities reflect complementary features of the data, and none can be considered superior to the others. In this paper, we present an ensemble of SVM classifiers that reduces the misclassification error by combining different dissimilarities. The proposed method has been applied to identify cancerous tissues using microarray gene expression data, with remarkable results.
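The general idea of an SVM ensemble over several dissimilarities can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, since the abstract does not describe the construction: the three dissimilarities (Euclidean, Manhattan, cosine), the representation of each sample by its dissimilarities to the training samples, the simple subgradient-descent solver for the soft-margin linear SVM, the majority-vote combination, and the toy data are all hypothetical choices and may differ from the authors' method.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen here for zero vectors
    return 1.0 - dot / (norm_a * norm_b)

def embed(x, prototypes, dissim):
    # Assumed representation: describe x by its dissimilarities to the
    # training samples, plus a constant 1.0 that acts as a bias term.
    return [dissim(x, p) for p in prototypes] + [1.0]

def train_linear_svm(features, labels, lam=0.01, epochs=200):
    # Soft-margin linear SVM fitted by stochastic subgradient descent on
    # the L2-regularized hinge loss, with step size 1 / (lam * t).
    w = [0.0] * len(features[0])
    t = 0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * sum(wi * fi for wi, fi in zip(w, f))
            w = [(1.0 - eta * lam) * wi for wi in w]  # regularization step
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wi + eta * y * fi for wi, fi in zip(w, f)]
    return w

def predict_one(w, f):
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) >= 0.0 else -1

def majority_vote(votes):
    # Votes are +1 or -1; a tie (possible only with an even number of
    # members) is resolved in favour of +1.
    return 1 if sum(votes) >= 0 else -1

def train_ensemble(X, y, dissimilarities):
    # One SVM per dissimilarity, each trained on its own representation.
    members = []
    for dissim in dissimilarities:
        feats = [embed(x, X, dissim) for x in X]
        members.append((dissim, train_linear_svm(feats, y)))
    return members

def predict_ensemble(members, prototypes, x):
    votes = [predict_one(w, embed(x, prototypes, d)) for d, w in members]
    return majority_vote(votes)

# Hypothetical toy data: two classes that differ mainly in direction.
X_train = [[1.0, 0.1], [2.0, 0.3], [3.0, 0.2],
           [0.2, 1.0], [0.3, 2.0], [0.1, 3.0]]
y_train = [-1, -1, -1, 1, 1, 1]
X_test = [[2.5, 0.2], [0.2, 2.5]]

members = train_ensemble(X_train, y_train,
                         [euclidean, manhattan, cosine_dissimilarity])
predictions = [predict_ensemble(members, X_train, x) for x in X_test]
print(predictions)
```

Each ensemble member sees the same samples through a different dissimilarity, so members can disagree on individual samples, and the vote is where the complementary information is combined. In practice one would likely replace the hand-written solver with an established SVM library and estimate the error by resampling.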

Author information

Authors and Affiliations

Universidad Pontificia de Salamanca, C/Compañía 5, 37002, Salamanca, Spain
Ángela Blanco & Manuel Martín-Merino

Editor information

Bruno Apolloni, Robert J. Howlett, Lakhmi Jain

About this paper

Cite this paper

Blanco, Á., Martín-Merino, M. (2007). Boosting Support Vector Machines Using Multiple Dissimilarities. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74819-9_18

DOI: https://doi.org/10.1007/978-3-540-74819-9_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74817-5
Online ISBN: 978-3-540-74819-9
eBook Packages: Computer Science (R0)
