An Insight on the ‘Large G, Small n’ Problem in Gene-Expression Microarray Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10255)

Abstract

This paper analyzes the effect of the high-dimensional, low-sample-size problem on cancer classification using gene-expression microarrays. Two key questions are addressed: (i) what percentage of genes can ensure highly accurate classification, and (ii) does this percentage differ from one classifier to another? Both questions are investigated through a set of experiments with two gene ranking algorithms, five classifiers and four DNA microarray databases.
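
The kind of experiment the abstract describes (rank the genes, keep only the top-ranked percentage, measure classification accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the signal-to-noise style ranking score, the nearest-centroid classifier, and the function names are all hypothetical stand-ins for the paper's two ranking algorithms, five classifiers and four databases.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n_samples=30, n_genes=100, n_informative=5, seed=0):
    """Synthetic two-class data with few samples and many genes (hypothetical)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 3.0 * label  # only the first genes carry class signal
        X.append(row)
        y.append(label)
    return X, y


def rank_genes(X, y):
    """Rank gene indices by |mean0 - mean1| / (sd0 + sd1), best first."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        scores.append(abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12))
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


def nearest_centroid_predict(X_train, y_train, x, genes):
    """Predict the class whose centroid (over `genes`) is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [mean(r[g] for r in rows) for g in genes]
        dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, fraction):
    """Leave-one-out accuracy using the top `fraction` of genes.

    Genes are re-ranked inside every fold so the held-out sample
    never influences which genes are selected.
    """
    k = max(1, round(fraction * len(X[0])))
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = rank_genes(X_tr, y_tr)[:k]
        hits += nearest_centroid_predict(X_tr, y_tr, X[i], genes) == y[i]
    return hits / len(X)


if __name__ == "__main__":
    X, y = make_toy_data()
    for fraction in (0.05, 0.25, 1.0):
        acc = loo_accuracy(X, y, fraction)
        print(f"{fraction:>5.0%} of genes -> LOO accuracy {acc:.2f}")
```

Repeating the loop with a different classifier in place of `nearest_centroid_predict` would give a rough picture of the second question, namely whether the useful percentage of genes depends on the classifier.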

Acknowledgment

This work has been partially supported by the Spanish Ministry of Economy [TIN2013-46522-P], the Mexican PRODEP [DSA/103.5/15/7004], and the Generalitat Valenciana [PROMETEOII/2014/062].

Author information

Corresponding author

Correspondence to J. S. Sánchez.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

García, V., Sánchez, J.S., Cleofas-Sánchez, L., Ochoa-Domínguez, H.J., López-Orozco, F. (2017). An Insight on the ‘Large G, Small n’ Problem in Gene-Expression Microarray Classification. In: Alexandre, L., Salvador Sánchez, J., Rodrigues, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2017. Lecture Notes in Computer Science, vol 10255. Springer, Cham. https://doi.org/10.1007/978-3-319-58838-4_53

  • DOI: https://doi.org/10.1007/978-3-319-58838-4_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58837-7

  • Online ISBN: 978-3-319-58838-4

  • eBook Packages: Computer Science (R0)
