Abstract
In this study, we describe our preliminary efforts to build an i2b2-based integrated data repository that supports centralized data management for ovarian cancer clinical research, and we discuss lessons learned that can inform the evaluation and enhancement of future cancer-specific data repositories. We collected multiple types of heterogeneous clinical data for ovarian cancer, including demographic, outcome, chemotherapy treatment, and lab test information. To better integrate the different data types, we normalized the data by reusing standard codes and by creating mappings between local codes and standard vocabularies. We also developed extract, transform, and load (ETL) scripts to load the data into an i2b2 instance. Through further analytic practice, we evaluated the system against major expectations drawn from common clinical research needs, including cohort query and identification, clinical data-based hypothesis testing, and exploratory data mining. We also identified and discussed outstanding issues that we will address through additional enhancements of the existing i2b2 system.
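The normalization and ETL steps described above can be illustrated with a minimal sketch. It is not the scripts used in the study: the local codes, the mapping targets (placeholders, not real vocabulary identifiers), the `Fact` record, and the `transform` function are all hypothetical. Only the field names `patient_num`, `concept_cd`, `start_date`, and `nval_num` are borrowed from the i2b2 `observation_fact` table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical local-to-standard mapping. The targets are placeholders,
# not real LOINC or NDC identifiers.
LOCAL_TO_STANDARD = {
    "LOCAL_LAB_CA125": "LOINC:PLACEHOLDER-CA125",
    "LOCAL_RX_CARBO": "NDC:PLACEHOLDER-CARBOPLATIN",
}


@dataclass
class Fact:
    """A simplified row shaped like an i2b2 observation_fact record."""
    patient_num: int
    concept_cd: str
    start_date: date
    nval_num: Optional[float] = None


def transform(records):
    """Map local codes to standard concept codes.

    Returns the transformed facts and the list of local codes that had
    no mapping, so that gaps can be reviewed rather than silently dropped.
    """
    facts, unmapped = [], []
    for rec in records:
        concept = LOCAL_TO_STANDARD.get(rec["local_code"])
        if concept is None:
            unmapped.append(rec["local_code"])
            continue
        facts.append(
            Fact(rec["patient_num"], concept, rec["date"], rec.get("value"))
        )
    return facts, unmapped
```

In a sketch like this, the load step would insert each `Fact` into the target i2b2 instance, while the unmapped codes would feed back into the mapping work.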
Z. Li—Co-first author.
Acknowledgement
The study is supported in part by an NCI U01 Project – caCDE-QA (1U01CA180940-01A1), R01-CA122443, and an award from the Mayo Clinic Ovarian Cancer SPORE (P50 CA136393).
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Hong, N. et al. (2017). Building an i2b2-Based Integrated Data Repository for Cancer Research: A Case Study of Ovarian Cancer Registry. In: Wang, F., Yao, L., Luo, G. (eds) Data Management and Analytics for Medicine and Healthcare. DMAH 2016. Lecture Notes in Computer Science, vol 10186. Springer, Cham. https://doi.org/10.1007/978-3-319-57741-8_8
DOI: https://doi.org/10.1007/978-3-319-57741-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57740-1
Online ISBN: 978-3-319-57741-8
eBook Packages: Computer Science (R0)