Abstract
Health records are among the most sensitive kinds of personal information. Unwanted disclosure to unauthorised parties usually has significant negative consequences for the individual concerned, so health records must be adequately protected to preserve the individual’s privacy. However, health records are also valuable resources for clinical studies and research activities. Making the records available for privacy-preserving secondary use requires thorough de-personalisation to prevent re-identification. This paper introduces MEDSEC, a system that automatically converts paper-based health records into de-personalised and pseudonymised documents that secondary users can access without compromising the patients’ privacy. The system converts the paper-based records into a standardised structure that facilitates automated processing and the search for useful information.
Acknowledgments
We thank our business partners XiTrust Secure Technologies and Xylem Technologies for supporting the implementation of the case studies carried out within the MEDSEC project. The research was funded by BRIDGE (#824884), FFG-Austrian Research Promotion Agency, and supported by COMET K1, FFG-Austrian Research Promotion Agency.
Ethical standard
Since real-life records from a hospital archive with personal data were used in the case study, special care was taken to ensure the involved patients’ privacy. Access to the data was only allowed for the directly involved project members. Furthermore, the test data were only accessible within the archive computer network and records were not stored, copied, or processed outside the network environment.
About this article
Cite this article
Heurix, J., Fenz, S., Rella, A. et al. Recognition and pseudonymisation of medical records for secondary use. Med Biol Eng Comput 54, 371–383 (2016). https://doi.org/10.1007/s11517-015-1322-7
DOI: https://doi.org/10.1007/s11517-015-1322-7