Clinical Data Integration Strategies for Multicenter Studies

  • Conference paper
Technological Innovation for Connected Cyber Physical Spaces (DoCEIS 2023)

Abstract

Multicenter health studies strengthen medical research findings because of the number of subjects they can engage. To simplify the execution of these studies, data sharing should be effortless, for instance through interoperable databases. However, achieving this interoperability remains an open research topic. In the first stage of this work, we propose methodologies to optimize the harmonization pipelines of health databases, with the OMOP CDM as the destination schema. In the second stage, aiming to enrich the information stored in OMOP CDM databases, we investigated solutions to extract clinical concepts from unstructured narratives. In the final stage, we aimed to simplify the protocol execution of multicenter studies by proposing novel solutions that facilitate the discovery of databases. The developed solutions are currently being used in European projects that aim to create federated networks of health databases across Europe.
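
To make the harmonization step concrete, the following is a minimal sketch of mapping one source cohort record onto a few columns of the OMOP CDM PERSON table. It is an illustration under stated assumptions, not the pipeline proposed in this work: the source column names (`subject_id`, `sex`, `birth_date`), the `Person` class, and the `harmonize_person` helper are hypothetical. The gender concept IDs 8507 (male) and 8532 (female) are the standard OMOP concepts, and 0 is the OMOP convention for a value with no matching concept.

```python
from dataclasses import dataclass

# Standard OMOP concept IDs for gender; 0 marks "no matching concept".
GENDER_CONCEPTS = {"m": 8507, "male": 8507, "f": 8532, "female": 8532}
UNKNOWN_CONCEPT = 0


@dataclass
class Person:
    """Illustrative subset of the OMOP CDM PERSON table columns."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int
    gender_source_value: str  # original value kept for traceability


def harmonize_person(row: dict) -> Person:
    """Map one source record (hypothetical column names) to a PERSON row."""
    raw_gender = str(row["sex"]).strip()
    return Person(
        person_id=int(row["subject_id"]),
        gender_concept_id=GENDER_CONCEPTS.get(raw_gender.lower(), UNKNOWN_CONCEPT),
        # Assumes ISO 8601 dates (YYYY-MM-DD) in the source cohort.
        year_of_birth=int(str(row["birth_date"])[:4]),
        gender_source_value=raw_gender,
    )


# Hypothetical source rows from a cohort export.
source_rows = [
    {"subject_id": "1", "sex": "F", "birth_date": "1950-03-14"},
    {"subject_id": "2", "sex": "unknown", "birth_date": "1947-11-02"},
]
people = [harmonize_person(r) for r in source_rows]
```

Keeping the original value in `gender_source_value` alongside the mapped concept ID means unmapped values remain auditable after harmonization.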

Acknowledgments

This work has received support from the EU/EFPIA Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 806968. JRA has been funded by FCT (Foundation for Science and Technology) under the grant SFRH/BD/147837/2019.

Corresponding author

Correspondence to João Rafael Almeida.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Almeida, J.R., Pazos, A., Oliveira, J.L. (2023). Clinical Data Integration Strategies for Multicenter Studies. In: Camarinha-Matos, L.M., Ferrada, F. (eds) Technological Innovation for Connected Cyber Physical Spaces. DoCEIS 2023. IFIP Advances in Information and Communication Technology, vol 678. Springer, Cham. https://doi.org/10.1007/978-3-031-36007-7_13

  • DOI: https://doi.org/10.1007/978-3-031-36007-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36006-0

  • Online ISBN: 978-3-031-36007-7

  • eBook Packages: Computer Science, Computer Science (R0)
