Research Note
Dependency parsing for medical language and concept representation
Introduction
Sowa noticed a similarity between existential graphs (an early form of conceptual graphs) and the graphs of dependency grammar ([20], p. 8). Although he does not detail his observation, it is easy to see that the nodes of both graph types denote concepts. To illustrate the similarity, the dependency tree of the sentence and its conceptual graph are contrasted in Fig. 1(a) and (b).
We deem the noted similarity worth further exploration. In particular, we present and discuss a Prolog-based definition and implementation of dependency grammar that can (a) parse sentences of a medical language, (b) generate meaningful conceptual graphs from a canonical basis, and (c) act as a conceptual parser facilitating the transformation from one form of medical content representation into the other. We report on our experience with the described framework and draw a practically relevant conclusion.
Parsing medical texts with dependency grammar
Dependency theory is an old linguistic theory. It is based on the assumption that every word of a sentence has slots to be filled by other words, called its dependents. Dependency grammar was first formalized by Tesnière [24] and soon after by Gaifman [10] and Hayes [11], but has since been almost forgotten. A recent treatment of dependency parsing is Fraser's dissertation [7].
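The slot idea can be sketched in a few lines of Python. This is an illustration only and not the Prolog formalization discussed in this note; the class `Word`, the slot name `modifier`, and the two example words are assumptions introduced for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A word with named slots that other words (its dependents) may fill."""
    form: str
    slots: dict = field(default_factory=dict)  # slot name -> Word or None

    def fill(self, slot, dependent):
        # A slot must exist and be empty before it can be filled.
        if slot not in self.slots:
            raise KeyError(f"{self.form!r} has no slot {slot!r}")
        if self.slots[slot] is not None:
            raise ValueError(f"slot {slot!r} of {self.form!r} is already filled")
        self.slots[slot] = dependent

    def tree(self):
        """Return the dependency tree below this word as nested tuples."""
        return (
            self.form,
            [(name, dep.tree()) for name, dep in self.slots.items()
             if dep is not None],
        )


# Hypothetical mini-lexicon: the slot name is invented for the illustration.
uptake = Word("uptake", {"modifier": None})
increased = Word("increased")
uptake.fill("modifier", increased)
result = uptake.tree()
```

In this sketch a head word owns its slots, so a dependency tree is simply the head together with the trees of the words that fill them.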
According to our own formalization [21, 22], which is based on Hellwig's dependency unification grammar (DUG) [12], a grammar
Generating conceptual graphs with dependency rules
A dependency tree makes visible the grammatical structure of a sentence. Its content or meaning, however, is represented in a different form. According to the notation of conceptual structures [20], the de facto standard for medical concept representation, the content of sentence (1) could be represented by:
[uptake] -
    (existence) -> [certain]
    (form) -> [area] -
        (occurrences) -> [1],
    (quantification) -> [increased]
    (location) -> [lumbosacral junction] -
        (relative position) -> [at]
        (side) -> [right],,.
which is the
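As a rough illustration, the linear form above can be held as a nested data structure and flattened into concept-relation-concept triples. The sketch below is written in Python, not in the Prolog of the described framework. The nesting is one reading of the linear form, in which a comma closes the innermost open branch, and should be treated as an assumption; the function name `triples` is likewise invented for the example.

```python
# A concept node is (label, [(relation, child_node), ...]).
# ASSUMPTION: relations are attached to concepts by reading each comma in the
# linear form as closing the innermost open branch.
graph = (
    "uptake", [
        ("existence", ("certain", [])),
        ("form", ("area", [("occurrences", ("1", []))])),
        ("quantification", ("increased", [])),
        ("location", ("lumbosacral junction", [
            ("relative position", ("at", [])),
            ("side", ("right", [])),
        ])),
    ],
)


def triples(node):
    """Yield (concept, relation, concept) triples in depth-first order."""
    label, relations = node
    for relation, child in relations:
        yield (label, relation, child[0])
        yield from triples(child)


result = list(triples(graph))
```

Under this reading the graph yields seven triples, four of them attached directly to `uptake`.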
Preliminary results and discussion
We have used the described formalization of dependency grammar to implement the Canon Group's core merged model of radiology findings [6, 9]. Attempts to generate all canonical graphs derivable from the model showed that it contains several design flaws [23]. In particular, in its current form it includes hidden sources of infinite recursion. For example, the subtype relations:
pleural_effusion < rad_finding.
pleural_effusion < effusion.
effusion < observation.
and the canonical graph
[rad_finding] -
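How subtype relations can combine with a canonical graph to produce unbounded generation may be sketched as follows, in Python and not in the Prolog of the described framework. The subtype relations are the three quoted above. The canonical graph is truncated in this text, so the entry in `CANON` (a relation of `rad_finding` whose filler must be an `observation`) is purely hypothetical, as are the names `fillers` and `can_recurse`.

```python
# Subtype relations quoted in the text: subtype -> direct supertypes.
SUPERTYPES = {
    "pleural_effusion": ["rad_finding", "effusion"],
    "effusion": ["observation"],
}
TYPES = ["rad_finding", "observation", "effusion", "pleural_effusion"]

# HYPOTHETICAL canonical basis (not taken from the model): concept type ->
# type required as filler of one of its relations.
CANON = {"rad_finding": "observation"}


def is_subtype(sub, sup):
    """True if `sub` equals `sup` or reaches it through subtype relations."""
    if sub == sup:
        return True
    return any(is_subtype(parent, sup) for parent in SUPERTYPES.get(sub, []))


def fillers(concept):
    """Types that may fill a relation of `concept`, including inherited ones."""
    return [
        t
        for owner, required in CANON.items()
        if is_subtype(concept, owner)
        for t in TYPES
        if is_subtype(t, required)
    ]


def can_recurse(concept, seen=()):
    """True if generating graphs for `concept` can revisit a type on one path."""
    if concept in seen:
        return True
    return any(can_recurse(f, seen + (concept,)) for f in fillers(concept))


result = {t: can_recurse(t) for t in TYPES}
```

Under the hypothetical entry, `pleural_effusion` inherits the relation from `rad_finding` and is itself an admissible filler through `effusion < observation`, so a generator without such a check would nest pleural effusions indefinitely.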
Conclusion
The specification of medical language and concept representation is difficult and, without the aid of formal evaluation tools, error-prone. We have presented a uniform framework allowing the implementation and execution of dependency rules and canonical graphs on Prolog machines. This framework greatly facilitates the parsing and generation of natural language sentences and conceptual structures and should therefore prove a useful design tool.
Acknowledgements
The author's work has in part been supported by the Universitätsgesellschaft Hildesheim e.V.
References (26)
- Dependency systems and phrase-structure systems, Inf. Control (1965).
- et al., Inheritance hierarchies: semantics and unification, J. Symbolic Comput. (1989).
- Feature-constraint logics for unification grammars, J. Logic Program. (1992).
- et al., LOGIN: A logic programming language with built-in inheritance, J. Logic Program. (1986).
- et al., Natural language processing and semantic representation of medical texts, Methods Inform. Med. (1994).
- J. Bernauer, Subsumption principles underlying medical concept systems and their formal reconstruction, in: Proc. 18th...
- et al., A logical foundation for representation of clinical data, J. Am. Med. Inform. Assoc. (1994).
- B. Carpenter, The Logic of Typed Feature Structures, Cambridge University Press, Cambridge,...
- et al., Toward a medical-concept representation language, J. Am. Med. Inform. Assoc. (1994).
- N. Fraser, Dependency Parsing, PhD Thesis, University College London, London,...
- A schema for representing medical language applied to clinical radiology, J. Am. Med. Inform. Assoc.
- The Canon Group's effort: working toward a merged model, J. Am. Med. Inform. Assoc.
- Dependency theory: a formalism and some observations, Language.