Cross-Platform Pathway Activity Transformation and Classification of Microarray Data

  • Conference paper
Computational Intelligence in Information Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 331)

Abstract

One of the most challenging problems in microarray studies is analyzing microarray data from different platforms. Combining platforms improves the reliability of a study because the number of samples is larger, and it makes studies of rare diseases feasible, for which only a few microarray datasets have been published. Because different microarray platforms cover different numbers of genes, an integrative study of two platforms must be able to deal with missing values. Much work has been done on cross-platform microarray data utilization, but none of it has focused on gene-set-based microarray data classification. In this study, we applied a Bayesian-based method to reconstruct the expression levels of the missing genes before transforming them into gene-set activity. Two gene-set activity transformation methods, Negatively Correlated Feature Set (NCFS-i) and Analysis-of-Variance Feature Set (AFS), were used to evaluate the performance of this method on actual microarray datasets. The results show that imputing missing data can improve the classification performance of a cross-platform study.
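
The workflow described in the abstract (merge two platforms, fill in the genes one platform lacks, then convert genes to gene-set activity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the toy values are invented, column-mean imputation is a simple stand-in for the Bayesian-based method, and the mean z-score per gene set is a generic activity score, not NCFS-i or AFS.

```python
import numpy as np


def merge_platforms(expr_a, genes_a, expr_b, genes_b):
    """Stack two samples-by-genes matrices on the union of their genes.

    Genes that a platform does not measure are left as NaN.
    """
    genes = sorted(set(genes_a) | set(genes_b))
    col = {g: j for j, g in enumerate(genes)}
    n_a, n_b = expr_a.shape[0], expr_b.shape[0]
    merged = np.full((n_a + n_b, len(genes)), np.nan)
    merged[:n_a, [col[g] for g in genes_a]] = expr_a
    merged[n_a:, [col[g] for g in genes_b]] = expr_b
    return merged, genes


def impute_column_mean(merged):
    """Fill each missing value with the mean of the observed values of that gene.

    Stand-in only: the paper uses a Bayesian-based imputation method.
    """
    means = np.nanmean(merged, axis=0)
    return np.where(np.isnan(merged), means, merged)


def gene_set_activity(expr, genes, gene_sets):
    """Score each gene set per sample as the mean z-score of its member genes.

    Generic illustration only, not NCFS-i or AFS.
    """
    std = expr.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - expr.mean(axis=0)) / std
    col = {g: j for j, g in enumerate(genes)}
    activity = np.zeros((expr.shape[0], len(gene_sets)))
    for k, members in enumerate(gene_sets.values()):
        idx = [col[g] for g in members if g in col]
        if idx:  # sets with no measured genes keep an activity of zero
            activity[:, k] = z[:, idx].mean(axis=1)
    return activity


# Toy data (invented): platform A measures g1-g3, platform B measures g2-g4.
expr_a = np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
expr_b = np.array([[6.0, 7.0, 8.0], [8.0, 9.0, 10.0]])
merged, genes = merge_platforms(expr_a, ["g1", "g2", "g3"],
                                expr_b, ["g2", "g3", "g4"])
imputed = impute_column_mean(merged)
gene_sets = {"set1": ["g1", "g2"], "set2": ["g3", "g4"]}
activity = gene_set_activity(imputed, genes, gene_sets)
```

The resulting `activity` matrix has one row per sample and one column per gene set, and could be passed to any classifier in place of the gene-level matrix.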

Author information

Corresponding author

Correspondence to Worrawat Engchuan.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Engchuan, W., Meechai, A., Tongsima, S., Chan, J.H. (2015). Cross-Platform Pathway Activity Transformation and Classification of Microarray Data. In: Phon-Amnuaisuk, S., Au, T. (eds) Computational Intelligence in Information Systems. Advances in Intelligent Systems and Computing, vol 331. Springer, Cham. https://doi.org/10.1007/978-3-319-13153-5_14

  • DOI: https://doi.org/10.1007/978-3-319-13153-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13152-8

  • Online ISBN: 978-3-319-13153-5

  • eBook Packages: Engineering, Engineering (R0)
