Feature Selection Based on Activation of Signaling Pathways Applied for Classification of Samples in Microarray Studies

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7268)

Abstract

This paper presents a new method of deriving features for sample classification from high-throughput data such as microarray gene expression studies. The number of features in these studies is much larger than the number of samples, so strong dimensionality reduction is essential. Standard approaches attempt to select the subsets of features (genes) most strongly associated with the target, and they tend to produce unstable and non-reproducible feature sets. The purpose of this work is to improve feature selection by using prior biological knowledge of potential relationships between features, available, e.g., in signaling pathway databases. We first identify the most activated pathways and then derive pathway-based features from the expression of the up- and down-regulated genes in each pathway. We demonstrate the performance of this approach using real microarray data.
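The two steps named in the abstract (rank pathways by activation, then build one feature per pathway from its up- and down-regulated genes) can be illustrated with a minimal sketch. The abstract does not state which statistics the paper uses, so everything concrete here is an assumption made for illustration: the Welch t-statistic as the per-gene score, the mean absolute t as the pathway activation score, the "mean of up-regulated genes minus mean of down-regulated genes" feature, and all function names and toy data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two lists of expression values (assumed gene score)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def gene_scores(expr, labels):
    """expr maps gene -> list of values per sample; labels holds 0/1 per sample."""
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = welch_t(pos, neg)
    return scores


def rank_pathways(pathways, scores):
    """Order pathway names by mean |t| of their genes (assumed activation score)."""
    def activation(genes):
        present = [abs(scores[g]) for g in genes if g in scores]
        return mean(present) if present else 0.0
    return sorted(pathways, key=lambda p: activation(pathways[p]), reverse=True)


def pathway_feature(genes, scores, expr, n_samples):
    """Per-sample feature: mean of up-regulated genes minus mean of down-regulated genes."""
    up = [g for g in genes if scores.get(g, 0.0) > 0]
    down = [g for g in genes if scores.get(g, 0.0) < 0]
    feature = []
    for i in range(n_samples):
        u = mean(expr[g][i] for g in up) if up else 0.0
        d = mean(expr[g][i] for g in down) if down else 0.0
        feature.append(u - d)
    return feature


# Hypothetical toy data: 6 samples (3 per class), 4 genes, 2 pathways.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],  # higher in class 1
    "g2": [1.0, 0.8, 1.1, 4.0, 4.2, 3.9],  # lower in class 1
    "g3": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],  # no class difference
    "g4": [3.0, 2.9, 3.1, 3.1, 3.0, 2.9],  # no class difference
}
pathways = {"P_flat": ["g3", "g4"], "P_active": ["g1", "g2"]}

scores = gene_scores(expr, labels)
ranked = rank_pathways(pathways, scores)
feature = pathway_feature(pathways[ranked[0]], scores, expr, len(labels))
```

In this toy setup the single pathway feature replaces four gene-level features and separates the two classes. In a real study the gene scores and the pathway ranking would have to be computed on training samples only, inside each cross-validation fold, to avoid an optimistic bias in the estimated classification error.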

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maciejewski, H. (2012). Feature Selection Based on Activation of Signaling Pathways Applied for Classification of Samples in Microarray Studies. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2012. Lecture Notes in Computer Science, vol 7268. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29350-4_34

  • DOI: https://doi.org/10.1007/978-3-642-29350-4_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29349-8

  • Online ISBN: 978-3-642-29350-4

  • eBook Packages: Computer Science, Computer Science (R0)
