Research article. DOI: 10.1145/2808719.2808741

Bone disease prediction and phenotype discovery using feature representation over electronic health records

Published: 09 September 2015

ABSTRACT

With the expansion of the healthcare industry and the overwhelming volume of electronic health records (EHRs) shared by healthcare institutions and practitioners, we aim to use EHR data to develop an effective disease risk management model that not only improves disease prediction but also yields clinically meaningful feature groupings that can be used to define bone disease phenotypes. In this paper, we explore the feasibility of extracting critical risk factors for osteoporosis and bone fracture prediction from heterogeneous EHRs. This will improve our understanding of bone disease risk arising from the complex interplay of bone mineral density (BMD) assessment with major risk factors such as gender, age, family history, and lifestyle. We focus on identifying individual or integrated risk factors (RFs) to predict osteoporosis and bone fractures, which are common diseases associated with aging and may be clinically silent yet cause significant mortality and morbidity. In addition, unbiased, EHR-driven phenotype discovery could be achieved with a massive EHR dataset and a computationally intensive analysis capable of identifying all of the phenotypes in the dataset. We infer the precise phenotypic patterns from a new feature representation of EHR data in order to group risk factors for osteoporosis. We present a 2-layer deep graphical model that uses EHR data with minimal human supervision and derives a new representation of medical objects by embedding them in a low-dimensional vector space. This low-dimensional risk factor representation supports risk factor embedding and clustering, and offers intuitive visualization through projection onto a 2D plane. We demonstrate the capability of our framework on a 20-year prospective dataset of 9,704 Caucasian women assessed for osteoporosis and bone fracture. The integrated representation not only improves risk prediction accuracy but also presents clinically meaningful risk factor groupings for bone diseases.
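
The idea of a two-layer model that maps records to a low-dimensional vector can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the layers are stacked Bernoulli restricted Boltzmann machines trained greedily with one-step contrastive divergence, and that the second layer has two hidden units so its activations can be plotted directly on a 2D plane. The class `TinyRBM`, the function `embed`, the layer sizes, and the toy `patients` matrix are all hypothetical.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def fit(self, rows, epochs=200, lr=0.1):
        for _ in range(epochs):
            for v0 in rows:
                # Positive phase: hidden activations given the data.
                h0 = self.hidden_probs(v0)
                h_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
                # Negative phase: one reconstruction step.
                v1 = self.visible_probs(h_sample)
                h1 = self.hidden_probs(v1)
                for i in range(len(v0)):
                    for j in range(len(h0)):
                        self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
                    self.b_vis[i] += lr * (v0[i] - v1[i])
                for j in range(len(h0)):
                    self.b_hid[j] += lr * (h0[j] - h1[j])
        return self


def embed(rows, layer_sizes=(4, 2), seed=0):
    """Greedily train one RBM per layer and return the top-layer activations."""
    rng = random.Random(seed)
    data = [[float(x) for x in r] for r in rows]
    for n_hidden in layer_sizes:
        rbm = TinyRBM(len(data[0]), n_hidden, rng).fit(data)
        data = [rbm.hidden_probs(v) for v in data]
    return data


# Hypothetical binary risk-factor indicators, one row per patient
# (illustrative values only, not taken from the study dataset).
patients = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1],
]
points_2d = embed(patients)  # one (x, y) pair per patient, each value in (0, 1)
```

Each entry of `points_2d` can be passed to any scatter-plot routine. A real analysis would use far more records and would likely apply a dedicated projection method to a wider top layer rather than forcing the last layer down to two units.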

Published in

BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics
September 2015
683 pages
ISBN: 9781450338530
DOI: 10.1145/2808719

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

BCB '15 paper acceptance rate: 48 of 141 submissions, 34%. Overall acceptance rate: 254 of 885 submissions, 29%.
