Renewed interest in de-identification (also known as “anonymisation”) has led to the development, in the United States, of several systems that perform very well on challenge test sets. De-identification must, however, be tuned to local documents and their specific characteristics. We address two issues raised in this context. First, tuning is generally performed by language engineers, who should not have to work on identified text; we therefore perform a first, coarse de-identification step inside the hospital. Second, to set up a de-identification system for new documents in a language other than English, here French patient reports, we tested two methods: the first adapts an existing US de-identifier built for English; the second develops a new system that applies the same methods. The first method involved localizing patterns designed for English, which proved cumbersome and did not quickly yield good performance. With a similar effort, the second method obtained much better results. Evaluated on a set of 23 randomly selected texts from a corpus of 21,749 clinical texts, it obtained 83% recall and 92% precision.
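To make the idea of a pattern-based first pass concrete, the following is a minimal Python sketch of what a coarse masking step for French clinical text might look like. It is not the authors' system: the regular expressions, the placeholder tags (`<TEL>`, `<DATE>`, `<NOM>`) and the function name `coarse_deidentify` are all assumptions made for illustration, and a real de-identifier would need far broader coverage.

```python
import re

# Hypothetical patterns for illustration only; they are NOT the patterns
# used in the work described above.
# French-style phone number: ten digits starting with 0, optionally
# grouped in pairs separated by spaces or dots.
PHONE = re.compile(r"\b0\d(?:[ .]?\d{2}){4}\b")
# Numeric date such as 12/03/2009, 12.03.09 or 12-03-2009.
DATE = re.compile(r"\b\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}\b")
# A civility or medical title followed by a capitalized surname.
TITLED_NAME = re.compile(r"\b(Dr|Pr|M\.|Mme|Mlle)\s+([A-ZÉÈ][\w'-]+)")


def coarse_deidentify(text: str) -> str:
    """Mask phone numbers, numeric dates and titled surnames."""
    # Phone numbers are masked first so that dot-separated numbers
    # are not partly consumed by the date pattern.
    text = PHONE.sub("<TEL>", text)
    text = DATE.sub("<DATE>", text)
    # Keep the title, replace only the surname.
    text = TITLED_NAME.sub(lambda m: f"{m.group(1)} <NOM>", text)
    return text


if __name__ == "__main__":
    sample = "Mme Martin vue le 12/03/2009 par le Dr Dupont, tel 01 23 45 67 89."
    print(coarse_deidentify(sample))
```

The order of substitutions matters in this sketch: a dot-separated phone number would otherwise be partially matched as a date. Patterns of this kind are language- and site-specific, which is consistent with the abstract's observation that localizing English patterns was cumbersome.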