Multi-task regression learning for survival analysis via prior information guided transductive matrix completion

  • Research Article
  • Published in: Frontiers of Computer Science

Abstract

Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for disease prognosis. Because study periods are limited and subjects may be lost to follow-up, the observed data inevitably contain censored instances, which poses a challenge that distinguishes survival analysis from general regression problems. Survival analysis also suffers from other inherent challenges, such as high dimensionality and small sample sizes. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimensional, small-sample-size data. In addition, we use the prior temporal stability of the survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms widely used competing methods.
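
As a rough illustration of why censored instances fit a matrix completion view, the sketch below encodes survival data as a status matrix with one column per time interval. It is not the PigTMC model itself. The encoding convention (1 up to and including the observed time, 0 after an observed event, unknown after censoring), the function names `encode_status_matrix` and `temporal_roughness`, and the toy data are all assumptions made for illustration. The roughness score is only one simple way to express the idea that statuses at adjacent intervals should rarely change.

```python
from typing import List, Optional

Status = Optional[int]  # 1 = event not yet occurred, 0 = event occurred, None = unknown


def encode_status_matrix(
    times: List[int], observed: List[bool], n_intervals: int
) -> List[List[Status]]:
    """Build one row per instance and one column per time interval (1..n_intervals).

    Assumed convention: status is 1 while k <= time. After that it is 0 if the
    event was observed, and None (a missing entry to be completed) if the
    instance was censored.
    """
    rows: List[List[Status]] = []
    for t, event_seen in zip(times, observed):
        row: List[Status] = []
        for k in range(1, n_intervals + 1):
            if k <= t:
                row.append(1)
            elif event_seen:
                row.append(0)
            else:
                row.append(None)
        rows.append(row)
    return rows


def temporal_roughness(row: List[Status]) -> float:
    """Sum of squared differences between adjacent known statuses in one row."""
    total = 0.0
    for a, b in zip(row, row[1:]):
        if a is not None and b is not None:
            total += (a - b) ** 2
    return total


# Toy data (hypothetical): instance 0 has an observed event at interval 3,
# instance 1 is censored at interval 2.
Y = encode_status_matrix(times=[3, 2], observed=[True, False], n_intervals=5)
for row in Y:
    print(row, temporal_roughness(row))
```

In this encoding the censored instance still contributes its known leading entries, and only the entries after the censoring time are left for completion, which is the intuition behind using censored and uncensored instances together.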

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61872190, 61772285, 61572263 and 61906098), in part by the Natural Science Foundation of Jiangsu Province (BK20161516), and in part by the Open Fund of MIIT Key Laboratory of Pattern Analysis and Machine Intelligence of NUAA.

Author information

Corresponding author

Correspondence to Lei Chen.

Additional information

Lei Chen received his master's degree in computer software and theory from Nanjing University of Aeronautics and Astronautics, China in 2005, and his PhD degree in communication and information systems from Nanjing University of Posts and Telecommunications, China in 2014. He is currently a professor in the School of Computer Science, Nanjing University of Posts and Telecommunications, China. He was a visiting researcher at The University of North Carolina at Chapel Hill, USA, from June 2016 to June 2017. His research interests include machine learning, pattern recognition, and medical image analysis.

Kai Shao earned his bachelor's degree in computer science from Nanjing University of Information Science and Technology, China in 2017. Since 2017, he has been pursuing a master's degree in computer science at Nanjing University of Posts and Telecommunications, China. His research focuses on machine learning and data mining.

Xianzhong Long received his BS degree from Henan Polytechnic University, China in 2007 and his MS degree from Xihua University, China in 2010, both in computer science, and obtained his PhD degree from Shanghai Jiao Tong University, China in June 2014. He is now an assistant professor at Nanjing University of Posts and Telecommunications, China. His research interests are computer vision, machine learning, and image processing, in particular image classification, object recognition, and clustering.

Lingsheng Wang earned his bachelor's degree in computer science from Nanjing University of Posts and Telecommunications (NJUPT), China in 2018. Since 2018, he has been pursuing a master's degree in computer science at NJUPT. His research focuses on machine learning and data mining.

Cite this article

Chen, L., Shao, K., Long, X. et al. Multi-task regression learning for survival analysis via prior information guided transductive matrix completion. Front. Comput. Sci. 14, 145312 (2020). https://doi.org/10.1007/s11704-019-8374-z

  • DOI: https://doi.org/10.1007/s11704-019-8374-z
