DOI: 10.1145/3233547.3233591
research-article

Applying Stochastic Process Model to Imputation of Censored Longitudinal Data

Published: 15 August 2018

Abstract

Longitudinal data are widely used in medicine, demography, sociology and other areas. Incomplete observations in such data often confound the results of analysis, and many imputation methods have been proposed to alleviate this problem. The Stochastic Process Model (SPM) is a general framework for modeling the joint evolution of repeatedly measured variables and a time-to-event outcome, as typically observed in longitudinal studies of aging, health and longevity. This makes it well suited to imputing missing observations in censored longitudinal data. We applied SPM to this imputation problem, using data from the Framingham Heart Study and the Cardiovascular Health Study as well as simulated datasets. We also present an R package, stpm, designed for this purpose.
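The idea of jointly modeling a measured variable and a time-to-event outcome can be illustrated with a small sketch. The Python code below is not the stpm implementation and is not taken from the paper. It assumes a one-dimensional, discrete-time mean-reverting process with a quadratic (U-shaped) hazard, which is one common way SPM-type models are written in the literature. All function names and parameter values (`a`, `f1`, `b`, `mu0`, `q`, `f`) are hypothetical and chosen only for illustration. The imputation step fills a single interior gap with its conditional mean under the assumed dynamics and, as a simplification, ignores the survival information that a full SPM-based imputation would also use.

```python
import math
import random


def simulate_subject(rng, n_steps=20, y0=80.0, a=-0.05, f1=80.0, b=2.0,
                     mu0=0.01, q=2e-4, f=80.0):
    """Simulate one trajectory that is censored by death.

    Assumed dynamics (illustrative, not from the paper):
        Y[t+1] = Y[t] + a * (Y[t] - f1) + b * eps,  eps ~ N(0, 1)
        hazard(Y) = mu0 + q * (Y - f)**2
    Returns the list of values observed before death or end of follow-up.
    """
    ys = [y0]
    for _ in range(n_steps - 1):
        y = ys[-1]
        hazard = mu0 + q * (y - f) ** 2
        # Probability of dying during one unit time step.
        if rng.random() < 1.0 - math.exp(-hazard):
            break
        ys.append(y + a * (y - f1) + b * rng.gauss(0.0, 1.0))
    return ys


def impute_interior(ys, a, f1):
    """Fill isolated interior gaps (None) with their conditional mean.

    The dynamics above are an AR(1) process Y[t+1] = c + phi * Y[t] + noise
    with phi = 1 + a and c = -a * f1. Given both neighbours, the conditional
    mean of the missing value is
        (c + phi * Y[t-1] + phi * (Y[t+1] - c)) / (1 + phi**2).
    """
    phi = 1.0 + a
    c = -a * f1
    out = list(ys)
    for t in range(1, len(ys) - 1):
        if ys[t] is None and ys[t - 1] is not None and ys[t + 1] is not None:
            prior_mean = c + phi * ys[t - 1]
            out[t] = (prior_mean + phi * (ys[t + 1] - c)) / (1.0 + phi ** 2)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    trajectory = simulate_subject(rng)
    if len(trajectory) >= 3:
        trajectory[1] = None  # pretend one visit was missed
    print(impute_interior(trajectory, a=-0.05, f1=80.0))
```

A multiple-imputation variant would draw from the conditional distribution rather than plugging in its mean, and would repeat the draw several times to propagate uncertainty.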


Published In

BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2018
727 pages
ISBN: 9781450357944
DOI: 10.1145/3233547

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. biostatistics
  2. data imputation
  3. multiple imputation
  4. stochastic processes

Qualifiers

  • Research-article

Funding Sources

  • National Institute on Aging

Conference

BCB '18

Acceptance Rates

BCB '18 Paper Acceptance Rate: 46 of 148 submissions, 31%
Overall Acceptance Rate: 254 of 885 submissions, 29%
