Abstract
Transfer learning methods have been applied successfully to a wide range of real-world problems, but there have been almost no attempts to use them effectively in healthcare applications. In the healthcare domain, solving the “when to transfer” problem is critical, because transfer between highly divergent source and target domains can lead to negative transfer. Most existing work in transfer learning focuses on selecting useful information from the source to improve performance on the target task; whether transfer learning can help, and when it should be applied to the target task, remain open challenges. In this paper, we address the “when to transfer” issue by proposing a sparse feature selection model based on a constrained elastic net penalty. As a case study, we demonstrate the performance of the proposed model on diabetes electronic health records (EHRs) that contain patient records from all fifty states in the United States. Our approach selects relevant features for transferring knowledge from the source to the target tasks. The proposed model can measure the differences between multivariate data distributions conditional on the predicted model, and based on this measurement we can avoid unsuccessful transfer. We successfully transfer knowledge across states: for a state with too few records to build an individualized predictive model, information from other states improves the diagnosis of diabetes.
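To make the sparse feature selection idea concrete, the following is a minimal sketch of the plain elastic net (Zou and Hastie 2005) fitted by coordinate descent in the style of Friedman et al. (2010). It is not the constrained model proposed in the paper, whose constraint is not stated in this abstract. The function names, the toy data, and the values of `lam` and `alpha` are assumptions chosen for illustration only.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for the (unconstrained) elastic net objective

        (1 / 2n) * ||y - Xw||^2
            + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2)

    X is a list of rows (no intercept column); y is a list of targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)
            ) / n
            denom = col_sq[j] + lam * (1.0 - alpha)
            new_wj = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Hypothetical toy data: feature 0 has a strong effect, feature 1 a weak one,
# and feature 2 is irrelevant. The three columns are mutually orthogonal.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0],
    [-1.0, -1.0, 1.0],
    [1.0, 0.0, -2.0],
    [-1.0, 0.0, -2.0],
]
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

w = elastic_net_cd(X, y, lam=0.5, alpha=0.5)
print(w)  # the weak and irrelevant features are shrunk exactly to zero
```

The L1 part of the penalty is what sets coefficients exactly to zero and so performs feature selection, while the L2 part shrinks the surviving coefficients. Raising `lam` far enough removes every feature.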




Notes
Although in Zhou et al. (2012) it has been named as multi-task Lasso, both \(L_1\)-norm and \(L_2\)-norm penalties are used in the optimization formulation.
References
Arnold A, Nallapati R, Cohen WW (2007) A comparative study of methods for transductive transfer learning. In: Seventh IEEE international conference on data mining workshops, 2007. ICDM Workshops 2007, p 77–82
Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. ACL 7:440–447
Caruana R (1997) Multitask learning. Mach Learn 28(1):41–75
Dai W, Yang Q, Xue G, Yu Y (2007) Boosting for transfer learning. In: ICML’07: Proceedings of the 24th international conference on Machine learning, p 193–200
Dai W, Yang Q, Xue GR, Yu Y (2008) Self-taught clustering. In: Proceedings of the 25th international conference on machine learning, ACM, p 200–207
Donoho DL, Johnstone IM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3):425–455
Argyriou A, Evgeniou T, Pontil M (2007) Multi-task feature learning. In: Proceedings of the 2006 conference on advances in neural information processing systems, vol. 19. The MIT Press, Cambridge, p 41
Evgeniou T, Pontil M (2004) Regularized multi-task learning. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, p 109–117
Farhadi A, Forsyth D, White R (2007) Transfer learning in sign language. In: IEEE Conference on computer vision and pattern recognition, CVPR’07, IEEE, p 1–8
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1–22
Fung GPC, Yu JX, Lu H, Yu PS (2006) Text classification without negative examples revisit. IEEE Trans Knowl Data Eng 18(1):6–20
Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York
Liu J, Ji S, Ye J (2009) Multi-task feature learning via efficient \(\ell_{2,1}\)-norm minimization. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, Corvallis, p 339–348
Mihalkova L, Mooney RJ (2008) Transfer learning by mapping with minimal target data. In: Proceedings of the AAAI-08 workshop on transfer learning for complex tasks
Pan J (2010) Feature-based transfer learning with real-world applications. Ph.D. thesis, The Hong Kong University of Science and Technology
Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
Pan SJ, Zheng VW, Yang Q, Hu DH (2008) Transfer learning for wifi-based indoor localization. In: Association for the advancement of artificial intelligence (AAAI) workshop, p 6
Practice Fusion Diabetes Classification: Identify patients diagnosed with Type 2 Diabetes (2012). https://www.kaggle.com/c/pf2012-diabetes
Raina R, Battle A, Lee H, Packer B, Ng AY (2007) Self-taught learning: transfer learning from unlabeled data. In: Proceedings of the 24th international conference on Machine learning, ACM, p 759–766
Rosenstein MT, Marx Z, Kaelbling LP, Dietterich TG (2005) To transfer or not to transfer. In: NIPS 2005 workshop on inductive transfer: 10 years later, vol. 2, p 7
Rückert U, Kramer S (2008) Kernel-based inductive transfer. In: Machine learning and knowledge discovery in databases. Springer, Heidelberg, p 220–233
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58(1):267–288
Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494
Ye J, Liu J (2012) Sparse methods for biomedical data. ACM SIGKDD Explor Newslett 14(1):4–15
Zhou J, Chen J, Ye J (2012) Malsar: multi-task learning via structural regularization. Arizona State University, Phoenix
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc B 67(2):301–320
Acknowledgments
This work was supported in part by NSF Grants IIS-1242304, IIS-1231742 and NIH Grant R21CA175974.
Additional information
Responsible editors: Fei Wang, Gregor Stiglic, Ian Davidson and Zoran Obradovic.
Cite this article
Li, Y., Vinzamuri, B. & Reddy, C.K. Constrained elastic net based knowledge transfer for healthcare information exchange. Data Min Knowl Disc 29, 1094–1112 (2015). https://doi.org/10.1007/s10618-014-0389-3
DOI: https://doi.org/10.1007/s10618-014-0389-3