On Eigen-matrix translation method for classification of biological data

Journal of Systems Science and Complexity

Abstract

Driven by the challenge of integrating large amounts of experimental data, classification has emerged as one of the major tools in computational biology and bioinformatics research. Machine learning methods, especially kernel methods with Support Vector Machines (SVMs), are popular and effective. From the perspective of the kernel matrix, a technique called Eigen-matrix translation has been introduced for protein data classification. The Eigen-matrix translation strategy has a number of useful properties that deserve further exploration. This paper investigates the role of Eigen-matrix translation in classification. The authors propose that its importance lies in reducing the dimension of the predictor attributes in the data set, which matters most when the number of features is very large. Numerical experiments on real biological data sets show that the proposed framework is effective in improving classification accuracy. It can therefore serve as a novel perspective for future research on dimension reduction problems.

Author information

Corresponding author

Correspondence to Hao Jiang.

Additional information

This research work was supported by the Research Grants Council of Hong Kong under Grant No. 17301214, HKU CERG Grants, the Fundamental Research Funds for the Central Universities, the Research Funds of Renmin University of China, the Hung Hing Ying Physical Research Grant, and the Natural Science Foundation of China under Grant No. 11271144.

This paper was recommended for publication by Editor GAO Xiao-Shan.

About this article

Cite this article

Jiang, H., Qiu, Y., Cheng, X. et al. On Eigen-matrix translation method for classification of biological data. J Syst Sci Complex 28, 1212–1230 (2015). https://doi.org/10.1007/s11424-015-3043-2

  • DOI: https://doi.org/10.1007/s11424-015-3043-2
