Boosting Support Vector Machines Using Multiple Dissimilarities

  • Conference paper
Knowledge-Based Intelligent Information and Engineering Systems (KES 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4692)

Abstract

Support Vector Machines (SVMs) are powerful machine learning techniques that can deal with high-dimensional and noisy data. They have been successfully applied to a wide range of problems, particularly the analysis of gene expression data.

However, SVM algorithms usually rely on the Euclidean distance, which often fails to reflect the proximities between objects. Several versions of the SVM have been proposed that incorporate non-Euclidean dissimilarities. Nevertheless, different dissimilarities reflect complementary features of the data, and none can be considered superior to the others. In this paper, we present an ensemble of SVM classifiers that reduces the misclassification error by combining different dissimilarities. The proposed method has been applied to identify cancerous tissues using microarray gene expression data, with remarkable results.
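
As an illustration of the idea in the abstract, the following minimal Python sketch trains one kernel SVM per dissimilarity and combines the resulting classifiers by majority vote. The abstract does not say which dissimilarities are used, how they enter the SVM, or how the classifiers are combined, so everything below is an assumption made for illustration: the three dissimilarities (Euclidean, Manhattan, cosine), the mapping exp(-gamma * d) from a dissimilarity to a kernel, the bias-free dual coordinate ascent solver, the plain majority vote (a simple stand-in, not the boosting scheme the title refers to), and the toy data.

```python
import math

# Three example dissimilarities (assumed; the abstract does not name them).
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_dissimilarity(a, b):
    # Undefined for all-zero vectors; the toy data below avoids them.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def make_kernel(dissimilarity, gamma=0.5):
    # Assumed mapping from dissimilarity to similarity. For an arbitrary
    # dissimilarity the resulting kernel need not be positive semidefinite.
    return lambda a, b: math.exp(-gamma * dissimilarity(a, b))

def train_svm(X, y, kernel, C=1.0, epochs=50):
    """Soft-margin kernel SVM without bias term, by dual coordinate ascent."""
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact maximiser along coordinate i, clipped to the box [0, C].
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha

def decision(x, X, y, alpha, kernel):
    return sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, X))

def ensemble_predict(x, X, y, models):
    # Each model casts a +1/-1 vote; an odd number of models avoids ties.
    votes = sum(1 if decision(x, X, y, alpha, kernel) >= 0 else -1
                for kernel, alpha in models)
    return 1 if votes >= 0 else -1

# Toy two-class data (hypothetical, not gene expression data).
X = [[1.0, 0.2], [0.9, 0.1], [1.2, 0.3],
     [0.2, 1.0], [0.1, 0.9], [0.3, 1.2]]
y = [1, 1, 1, -1, -1, -1]

# One SVM per dissimilarity.
models = []
for d in (euclidean, manhattan, cosine_dissimilarity):
    kernel = make_kernel(d)
    models.append((kernel, train_svm(X, y, kernel)))

label_a = ensemble_predict([1.1, 0.2], X, y, models)
label_b = ensemble_predict([0.2, 1.1], X, y, models)
```

Each base classifier sees the same training set through a different dissimilarity, so their errors can differ, which is what an ensemble needs in order to improve on its members. A weighted vote or a boosting-style reweighting of training samples could replace the majority vote in `ensemble_predict` without changing the rest of the sketch.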

Editor information

Bruno Apolloni, Robert J. Howlett, Lakhmi Jain

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Blanco, Á., Martín-Merino, M. (2007). Boosting Support Vector Machines Using Multiple Dissimilarities. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science, vol 4692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74819-9_18

  • DOI: https://doi.org/10.1007/978-3-540-74819-9_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74817-5

  • Online ISBN: 978-3-540-74819-9

  • eBook Packages: Computer Science, Computer Science (R0)
