Automatic generation of repeated patient information for tailoring clinical notes
Introduction
Within a series of reports for a given patient, a portion of the patient's information is often repeated, being carried over from one report in the series to the next. When creating a follow-up report, physicians often spend considerable time and effort both determining what information needs to be repeated and re-generating (dictating or typing) it [1]. This paper describes a methodology to automatically generate repeated patient information for creating new clinical notes. We define a clinical note to be a report that documents encounters with patients, including information such as demographics, medical history, surgical history, examination results or the current medical condition. We believe that such a system can reduce the total time needed to generate clinical notes and can also lead to more complete and accurate notes. Completeness is achieved because the system will always propagate to subsequent notes the information designated by the physician to be repeated. The accuracy of the repeated information depends on the accuracy of the previous note(s) in the series, so there is an ever-present danger of propagating erroneous information. With a user interface that clearly presents the repeated information to the physician, the content of each clinical note can be thoroughly reviewed before it is submitted to the patient's permanent record.
Our approach is to use the role a phrase or sentence plays within the document as a key feature in determining whether it should be repeated. The role a sentence plays, called a discourse role, is the author's intention in using that sentence. We have identified several discourse roles that occur within clinical notes in the pediatric urology domain, and some were found to be repeated more readily than others. The discourse role, together with other words mentioned in the same sentence, was shown to be a good feature set for predicting repetition of sentences in clinical notes with high accuracy. Though we have designed the system with our particular clinical note in view, we believe that our methodology is applicable to other documents that share similar characteristics.
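To make the idea of combining a discourse role with co-occurring words more concrete, the following is a minimal sketch, not the method used in this work. It assumes a simple frequency-based scorer; the function names `train_repeat_model` and `repeat_score`, the add-one smoothing, and the toy sentences are all hypothetical and chosen only for illustration. Only the tag names finding-abnorm and finding-norm come from the paper.

```python
from collections import defaultdict


def train_repeat_model(examples):
    """Count how often each feature co-occurs with repetition.

    `examples` is an iterable of (discourse_role, sentence, was_repeated).
    Features are the role alone, (role, None), and each (role, word) pair.
    This frequency model is an illustrative assumption only.
    """
    counts = defaultdict(lambda: [0, 0])  # feature -> [repeated, total]
    for role, sentence, was_repeated in examples:
        features = {(role, None)} | {(role, w) for w in sentence.lower().split()}
        for feature in features:
            counts[feature][1] += 1
            if was_repeated:
                counts[feature][0] += 1
    return dict(counts)


def repeat_score(model, role, sentence):
    """Average the smoothed repeat rates of the features seen in training.

    Returns 0.5 (no evidence either way) when no feature is known.
    """
    words = sorted(set(sentence.lower().split()))
    features = [(role, None)] + [(role, w) for w in words]
    rates = []
    for feature in features:
        if feature in model:
            repeated, total = model[feature]
            rates.append((repeated + 1) / (total + 2))  # add-one smoothing
    return sum(rates) / len(rates) if rates else 0.5


# Toy training data (invented for this sketch, not from the corpus).
examples = [
    ("finding-abnorm", "history of left hydronephrosis", True),
    ("finding-abnorm", "history of vesicoureteral reflux", True),
    ("finding-norm", "abdomen is soft today", False),
    ("finding-norm", "no fever today", False),
]
model = train_repeat_model(examples)
print(repeat_score(model, "finding-abnorm", "history of reflux"))
print(repeat_score(model, "finding-norm", "no pain today"))
```

In this toy setting, a sentence would be proposed for repetition when its score exceeds some threshold (for example 0.5), and the physician would still review every proposed sentence before the note is finalized.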
Section snippets
Generating repeated patient information
Our system generates text for documents by extracting text segments from a previous document and inserting them into a new document. We believe that copying text verbatim is a suitable way to generate repeated patient information, for the following two reasons:
- (1) Physicians often use particular phrases to describe medical observations or events, and a system that generates its own language may incorrectly convey this vital information.
- (2) There is usually no need for the system to
Results
The distribution of discourse roles in our corpus is shown in Fig. 7, where it can be seen that over half the sentences fell under the finding-abnorm or finding-norm tags. This is consistent with the fact that clinical notes are written mainly to document patient findings. Fig. 8 shows the repeat percentage of sentences using only the discourse role as a determining feature, which by itself turns out not to be very indicative of repeatability. Consequently, we
Discussion
Work in structured reporting [11], [12], [13], [14] has addressed the issue of reducing the amount of time for a physician to generate routine patient reports. Many of these systems utilize templates, which provide a general structure of the report and allow the physician to fill in the details, and macros, which allow the physician to use a type of shorthand to generate text for the report [15]. Though these systems can allow faster generation of reports, they are good only for generating
Future work
Future work will focus on improving the matching process between two semantic patterns. We are currently working on using the syntactic structure of language to prune unnecessary phrases before a match score is calculated, which should help focus the similarity metric. Work is also underway on determining semantic equivalence at the phrase level, which will allow the system to correlate semantic patterns that do not look alike on the surface but have
Conclusion
We presented a methodology for automatically generating repeated patient information in a series of clinical notes using semantic patterns and approximate sequence matching. Semantic patterns were used to determine discourse roles for sentences in the clinical notes, and based on the discourse role and other features, the system determined whether the sentence should be repeated in a subsequent note or not. Our system was trained on a corpus of pediatric urology clinical notes, and it was able
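As a rough illustration of what approximate sequence matching between a semantic pattern and a sentence can look like, the sketch below scores two token sequences with a token-level edit distance. This is an assumption made for illustration: the snippet above does not specify the matching algorithm or scoring scheme, and the names `edit_distance` and `match_score` as well as the `<FINDING>` placeholder token are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(
                prev[j] + 1,         # delete tok_a
                curr[j - 1] + 1,     # insert tok_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def match_score(pattern, tokens):
    """Normalize the distance to a similarity in [0, 1]; 1.0 is an exact match."""
    longest = max(len(pattern), len(tokens))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(pattern, tokens) / longest


# Hypothetical pattern and sentence, with a placeholder slot token.
pattern = ["no", "evidence", "of", "<FINDING>"]
sentence = ["no", "evidence", "of", "recurrent", "<FINDING>"]
print(match_score(pattern, sentence))
```

A sentence would be assigned the discourse role of a pattern only when the score clears a chosen threshold, so that small wording differences do not prevent a match.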
Acknowledgements
This work was supported in part by a grant from the National Institute of Biomedical Imaging and Bioengineering (NIBIB) PO1-EB00216, and from the National Library of Medicine, T15-LM07356.
References (21)
- et al., Identification of common molecular subsequences, J. Mol. Biol. (1981)
- Text generation in clinical medicine—a review, Methods Inf. Med. (2003)
- Automatically constructing a dictionary for information extraction tasks
- Automatically generating extraction patterns from untagged text
- I. Muslea, Extraction patterns for information extraction tasks: a survey. The AAAI-99 Workshop on Machine Learning for...
- S.B. Huffman, Learning information extraction patterns from examples, in: Proceedings of the 1995 IJCAI Workshop on New...
- et al., Learning to paraphrase: an unsupervised approach using multiple-sequence alignment
- et al., Automatic structuring of radiology free-text reports, Radiology (2001)
- Binary codes capable of correcting deletions, insertions and reversals, Soviet Phys. Doklady (1966)
- et al., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (1998)