ABSTRACT
With the expansion of the healthcare industry and the overwhelming volume of electronic health records (EHRs) shared by healthcare institutions and practitioners, we aim to use EHR data to develop an effective disease risk management model that not only improves disease prediction but also yields clinically meaningful feature groupings that can be used to define bone disease phenotypes. In this paper, we explore the feasibility of extracting critical risk factors for osteoporosis and bone fracture prediction from heterogeneous EHRs. Doing so improves our understanding of bone disease risk arising from the complex interplay between bone mineral density (BMD) assessment and major risk factors such as gender, age, family history, and lifestyle. We focus on identifying individual or integrated risk factors (RFs) that predict osteoporosis and bone fractures, common diseases that are associated with aging and may be clinically silent yet cause significant mortality and morbidity. Unbiased, EHR-driven phenotype discovery, in turn, could be achieved with a massive EHR dataset and a computationally intensive analysis capable of identifying all of the phenotypes in the dataset. We infer precise phenotypic patterns from a new feature representation learned from EHR data and use them to group risk factors for osteoporosis. We present a 2-layer deep graphical model that uses EHR data with minimal human supervision and derives a new representation of medical objects by embedding them in a low-dimensional vector space. This low-dimensional representation supports risk factor embedding and clustering, and offers intuitive visualization by projection onto a 2D plane. We demonstrate the capability of our framework on a 20-year prospective dataset of 9,704 Caucasian women under osteoporosis and bone fracture assessment. The integrated representation not only improves risk prediction accuracy but also produces clinically meaningful risk factor groupings for bone diseases.
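As a rough illustration of the idea of a 2-layer model that embeds records in a low-dimensional space, the sketch below stacks two small restricted Boltzmann machines and trains them greedily with one-step contrastive divergence. The abstract does not specify the architecture, so the choice of RBMs, the layer sizes, the learning rate, the helper names (`RBM`, `embed`), and the toy binary risk-factor records are all assumptions made for illustration. The 2-unit top layer is used here only so that the output can be plotted directly; it is not necessarily how the 2D projection is produced in the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr):
        # Positive phase on the data, negative phase on a one-step reconstruction.
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.c[j] += lr * (h0[j] - h1[j])

    def fit(self, data, epochs=50, lr=0.1):
        for _ in range(epochs):
            for v in data:
                self.cd1_update(v, lr)
        return self


def embed(records, hidden_sizes=(4, 2), seed=0):
    """Greedy layer-wise training: each RBM is fit on the previous layer's output."""
    rng = random.Random(seed)
    layer_input = records
    for n_hidden in hidden_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng).fit(layer_input)
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return layer_input


# Hypothetical binary risk-factor indicators, one row per patient (made-up data).
toy_records = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
]
embedding = embed(toy_records)  # one 2D point per patient, each coordinate in [0, 1]
```

Each row of `embedding` can be plotted as a point on a plane or passed to a downstream classifier. On real EHR data the layers would be much wider and the inputs would need preprocessing for mixed data types.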
Index Terms
- Bone disease prediction and phenotype discovery using feature representation over electronic health records