Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5226)

Abstract

Grounded in advances in statistical learning theory, the Support Vector Machine (SVM) has demonstrated unique features and state-of-the-art performance in many real-world classification problems. However, a conventional SVM uses a sign function to assign test data to classes, which has shown limitations that hinder its performance. This paper explores the feasibility of incorporating information theory-based approaches into the SVM decision-making process and demonstrates their application to the classification of imbalanced biological datasets. The results indicate that incorporating an information theory-based technique yields a significant improvement (p < 0.005), especially when classifying imbalanced datasets. The proposed methodology not only improves overall prediction performance but also makes SVM classification less sensitive to the selection of input parameters.
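The contrast drawn in the abstract can be illustrated with a minimal sketch. The first function is the conventional sign rule applied to SVM decision values. The second part is only an assumption about what an information theory-based decision step might look like: it picks the cut-off on the decision values that maximizes information gain on labelled data. The abstract does not specify the authors' actual technique, so the function names, the toy scores, and the choice of information gain are all hypothetical.

```python
import numpy as np

def sign_decision(scores):
    """Conventional SVM rule: class +1 if the decision value is >= 0, else -1."""
    return np.where(scores >= 0, 1, -1)

def entropy(labels):
    """Shannon entropy, in bits, of an array of class labels."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain_threshold(scores, labels):
    """Hypothetical alternative (an assumption, not the paper's method):
    choose the cut-off on decision values that maximizes information gain."""
    order = np.argsort(scores)
    s, y = scores[order], labels[order]
    base = entropy(y)
    best_gain, best_t = -1.0, 0.0
    for i in range(1, s.size):
        if s[i] == s[i - 1]:
            continue  # no cut-off can separate identical decision values
        left, right = y[:i], y[i:]
        weighted = (left.size * entropy(left) + right.size * entropy(right)) / y.size
        gain = base - weighted
        if gain > best_gain:
            best_gain, best_t = gain, (s[i] + s[i - 1]) / 2
    return best_t

def threshold_decision(scores, t):
    """Classify with a learned cut-off t in place of the fixed cut-off 0."""
    return np.where(scores >= t, 1, -1)

# Toy, made-up decision values for an imbalanced problem (6 negatives, 3 positives).
# The minority class sits close to the boundary, partly on the negative side.
scores = np.array([-2.0, -1.5, -1.2, -1.0, -0.9, -0.8, -0.4, -0.3, 0.2])
labels = np.array([-1, -1, -1, -1, -1, -1, 1, 1, 1])

t = info_gain_threshold(scores, labels)
print("sign rule correct:", int((sign_decision(scores) == labels).sum()), "of", labels.size)
print("learned cut-off:", t)
print("cut-off rule correct:", int((threshold_decision(scores, t) == labels).sum()), "of", labels.size)
```

In practice such a cut-off would be estimated on training or validation decision values and then applied to unseen data; the toy example reuses the same points only to keep the sketch short.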

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, H., Zheng, H. (2008). An Improved Support Vector Machine for the Classification of Imbalanced Biological Datasets. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2008. Lecture Notes in Computer Science, vol 5226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87442-3_9

  • DOI: https://doi.org/10.1007/978-3-540-87442-3_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87440-9

  • Online ISBN: 978-3-540-87442-3

  • eBook Packages: Computer Science, Computer Science (R0)
