Covariate-Related Structure Extraction from Paired Data

  • Conference paper
Information Technology in Bio- and Medical Informatics (ITBAM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9832)

Abstract

In the biological domain, it is increasingly common to apply several high-throughput technologies to the same set of samples. We propose a Covariate-Related Structure Extraction (CRSE) approach that explores relationships between different types of high-dimensional molecular data (views) in the context of sample covariate information from the experimental design, such as class membership. Real-world data analysis with an initial pipeline implementation of CRSE shows that the proposed approach successfully captures cross-view structures underlying multiple biologically relevant classification schemes, allowing class labels to be predicted for unseen examples from either view or across views.

Acknowledgement

We thank Ming Jin, Jin Zhao, Basem Kanawati, Philippe Schmitt-Kopplin, Andreas Albert, J. Barbro Winkler, and Anton R. Schäffner for kindly providing the datasets used in this study.

Author information

Corresponding author

Correspondence to Christian Böhm.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhou, L., Georgii, E., Plant, C., Böhm, C. (2016). Covariate-Related Structure Extraction from Paired Data. In: Renda, M., Bursa, M., Holzinger, A., Khuri, S. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2016. Lecture Notes in Computer Science, vol 9832. Springer, Cham. https://doi.org/10.1007/978-3-319-43949-5_11

  • DOI: https://doi.org/10.1007/978-3-319-43949-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43948-8

  • Online ISBN: 978-3-319-43949-5

  • eBook Packages: Computer Science, Computer Science (R0)
