Regularization and Shrinkage in Rough Set Based Canonical Correlation Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10313)

Abstract

Modern technology has enabled very high-dimensional multimodal data streams to be routinely acquired, which results in feature spaces whose dimension (p) is very high compared to the number of training samples (n). In this regard, the paper presents a new feature extraction algorithm to address the ‘small n and large p’ problem associated with multimodal data sets. It integrates both regularization and shrinkage with canonical correlation analysis (CCA): the diagonal elements of the covariance matrices are increased using regularization parameters, while the off-diagonal elements are decreased by shrinkage parameters. The theory of rough sets is used to find the optimum regularization parameters of CCA. The effectiveness of the proposed method, along with a comparison with other methods, is demonstrated on three pairs of modalities of two real-life data sets.
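
The covariance adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names `adjust_covariance` and `regularized_cca`, the parameters `r` (added to the diagonal) and `s` (a uniform factor that scales down the off-diagonals), and the example values are all assumptions made for illustration, and the rough-set-based selection of the regularization parameters is not shown.

```python
import numpy as np


def adjust_covariance(C, r, s):
    """Increase the diagonal of C by r and scale the off-diagonals by (1 - s).

    Hypothetical helper: r >= 0 plays the role of a regularization
    parameter, 0 <= s <= 1 the role of a shrinkage parameter.
    """
    C = np.asarray(C, dtype=float)
    diag = np.diag(np.diag(C))   # diagonal part of C
    off = C - diag               # off-diagonal part of C
    return diag + r * np.eye(C.shape[0]) + (1.0 - s) * off


def regularized_cca(X, Y, rx, ry, sx, sy):
    """Canonical correlations of X (n x p) and Y (n x q), in descending order,
    computed from the adjusted within-set covariance matrices."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxx = adjust_covariance(Xc.T @ Xc / (n - 1), rx, sx)
    Cyy = adjust_covariance(Yc.T @ Yc / (n - 1), ry, sy)
    Cxy = Xc.T @ Yc / (n - 1)
    # With r > 0 and 0 <= s <= 1 the adjusted matrices are positive definite,
    # so Cholesky factors exist even when n is much smaller than p or q.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # K = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical
    # correlations of the adjusted problem.
    K = np.linalg.solve(Lx, Cxy)
    K = np.linalg.solve(Ly, K.T).T
    return np.linalg.svd(K, compute_uv=False)


# Usage on synthetic 'small n and large p' data (values are arbitrary).
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))
Y = rng.normal(size=(10, 40))
rho = regularized_cca(X, Y, rx=0.1, ry=0.1, sx=0.2, sy=0.2)
print(rho[:3])
```

Without the adjustment, the sample covariance matrices in this example would be singular, since each has rank at most n - 1 = 9, and the whitening step would fail. Because only the within-set covariances are adjusted in this sketch, the returned values are not guaranteed to lie in [0, 1] when `s` is large.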

This work is partially supported by the Department of Electronics and Information Technology, Government of India (PhD-MLA/4(90)/2015-16).

Corresponding author

Correspondence to Pradipta Maji.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Mandal, A., Maji, P. (2017). Regularization and Shrinkage in Rough Set Based Canonical Correlation Analysis. In: Polkowski, L., et al. Rough Sets. IJCRS 2017. Lecture Notes in Computer Science, vol 10313. Springer, Cham. https://doi.org/10.1007/978-3-319-60837-2_36

  • DOI: https://doi.org/10.1007/978-3-319-60837-2_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60836-5

  • Online ISBN: 978-3-319-60837-2

  • eBook Packages: Computer Science, Computer Science (R0)
