Cross-Platform Pathway Activity Transformation and Classification of Microarray Data

Engchuan, Worrawat; Meechai, Asawin; Tongsima, Sissades; Chan, Jonathan H.

doi:10.1007/978-3-319-13153-5_14

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 331)

Abstract

One of the most challenging problems in microarray studies is analyzing data from different platforms together. Combining platforms improves the reliability of a study because the number of samples is larger, and it makes studies of rare diseases feasible, for which only a few microarray datasets have been published. Because different microarray platforms cover different numbers of genes, an integrative study of two platforms must deal with missing values. Much work has been done on using cross-platform microarray data, but none of it has focused on gene-set-based classification of microarray data. In this study, we applied a Bayesian-based method to reconstruct the expression levels of the missing genes before transforming them into gene-set activities. Two gene-set activity transformation methods, Negatively Correlated Feature Set (NCFS-i) and Analysis-of-Variance Feature Set (AFS), were used to evaluate the performance of this method on actual microarray datasets. The results show that imputing the missing data can improve classification performance in cross-platform studies.
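
The two-step idea in the abstract (impute the genes one platform lacks, then summarize genes into gene-set activities) can be illustrated with a small sketch. It is not the chapter's implementation: the abstract does not name the Bayesian method, so ridge regression, which is the posterior mean of a Bayesian linear model with a Gaussian prior on the weights, is used here as a stand-in, and the mean z-score per gene set is a placeholder for NCFS-i and AFS. The function names `impute_missing_genes` and `gene_set_activity`, the toy data, and the gene-set definitions are all hypothetical.

```python
import numpy as np


def impute_missing_genes(train, target, alpha=1.0):
    """Fill genes that `target` lacks, using a model fitted on `train`.

    train:  (n_a, g) samples from a platform that measures every gene.
    target: (n_b, g) samples from another platform; an unmeasured gene
            is a column that is entirely NaN.
    Assumption: ridge regression stands in for the Bayesian method;
    `alpha` acts as the noise-to-prior variance ratio.
    """
    missing = np.isnan(target).all(axis=0)   # genes absent on the target platform
    shared = ~missing                        # genes measured on both platforms
    mu_x = train[:, shared].mean(axis=0)
    mu_y = train[:, missing].mean(axis=0)
    X = train[:, shared] - mu_x
    Y = train[:, missing] - mu_y
    # Posterior-mean weights: (X'X + alpha * I)^-1 X'Y
    W = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)
    filled = target.copy()
    filled[:, missing] = (target[:, shared] - mu_x) @ W + mu_y
    return filled


def gene_set_activity(expr, gene_names, gene_sets):
    """Per-sample mean z-score of each set's member genes (placeholder
    for the NCFS-i and AFS transformations, which are not reproduced here)."""
    sd = expr.std(axis=0)
    z = (expr - expr.mean(axis=0)) / np.where(sd > 0, sd, 1.0)
    index = {g: i for i, g in enumerate(gene_names)}
    return {name: z[:, [index[g] for g in members]].mean(axis=1)
            for name, members in gene_sets.items()}


# Toy data: platform A measures four genes, platform B lacks g4.
rng = np.random.default_rng(0)
genes = ["g1", "g2", "g3", "g4"]
platform_a = rng.normal(size=(20, 4))
platform_a[:, 3] = 2.0 * platform_a[:, 0] - platform_a[:, 1]  # g4 depends on g1, g2
platform_b = rng.normal(size=(5, 4))
truth_g4 = 2.0 * platform_b[:, 0] - platform_b[:, 1]          # held-out true values
platform_b[:, 3] = np.nan                                     # not measured on B

filled_b = impute_missing_genes(platform_a, platform_b, alpha=1e-6)
merged = np.vstack([platform_a, filled_b])
activity = gene_set_activity(
    merged, genes, {"set1": ["g1", "g4"], "set2": ["g2", "g3"]}
)
```

In this toy setting the missing gene is an exact linear function of the shared genes, so the imputed values closely match the held-out ones; real expression data would be noisier, and the gene-set activities in `activity` would then be passed to a classifier.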

Author information

Authors and Affiliations

Data and Knowledge Engineering laboratory, School of Information Technology, King Mongkut’s University of Technology, Thonburi, Thailand
Worrawat Engchuan & Jonathan H. Chan
Department of Chemical Engineering, Faculty of Engineering, King Mongkut’s University of Technology, Thonburi, Thailand
Asawin Meechai
BioStatistics and Informatics laboratory, Genome Institute, National Center for Genetic Engineering and Biotechnology, Khlong Luang, Thailand
Sissades Tongsima

Corresponding author

Correspondence to Worrawat Engchuan.

Editor information

Editors and Affiliations

Jalan Tungku Link, Institut Teknologi Brunei, Gadong, Brunei Darussalam
Somnuk Phon-Amnuaisuk
Jalan Tungku Link, Institut Teknologi Brunei, Gadong, Brunei Darussalam
Thien Wan Au

About this paper

Cite this paper

Engchuan, W., Meechai, A., Tongsima, S., Chan, J.H. (2015). Cross-Platform Pathway Activity Transformation and Classification of Microarray Data. In: Phon-Amnuaisuk, S., Au, T. (eds) Computational Intelligence in Information Systems. Advances in Intelligent Systems and Computing, vol 331. Springer, Cham. https://doi.org/10.1007/978-3-319-13153-5_14

DOI: https://doi.org/10.1007/978-3-319-13153-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13152-8
Online ISBN: 978-3-319-13153-5
eBook Packages: Engineering, Engineering (R0)