Research Note
Dependency parsing for medical language and concept representation

https://doi.org/10.1016/S0933-3657(97)00041-9

Abstract

The theory of conceptual structures serves as a common basis for natural language processing and medical concept representation. We present a Prolog-based formalization of dependency grammar that can accommodate conceptual structures in its dependency rules. First results indicate that this formalization provides an operational basis for the implementation of medical language parsers and for the design of medical concept representation languages.

Introduction

Sowa noticed a similarity between existential graphs (an early form of conceptual graphs) and the graphs of dependency grammar ([20], p. 8). Although he does not detail his observation, it is easy to see that the nodes of both graph types denote concepts. To illustrate the similarity, the dependency tree of the sentence “There is an area of increased uptake at the right lumbosacral junction.” and its conceptual graph are contrasted in Fig. 1(a) and (b).

We deem the noted similarity worth further exploration. In particular, we present and discuss a Prolog-based definition and implementation of dependency grammar that can (a) parse sentences of a medical language, (b) generate meaningful conceptual graphs from a canonical basis, and (c) act as a conceptual parser facilitating the transformation from one form of medical content representation into the other. We report on our experience with the described framework and draw a practically relevant conclusion.

Parsing medical texts with dependency grammar

Dependency theory is an old linguistic theory. It is based on the assumption that every word of a sentence has slots to be filled by other words, called its dependents. Dependency grammar was first formalized by Tesnière [24] and soon after by Gaifman [10] and Hays [11], but has since been almost forgotten. A recent treatment of dependency parsing is Fraser's dissertation [7].
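The slot-filling idea can be sketched in a few lines. The following is a minimal Python illustration, not the Prolog formalization discussed in this note; the class `Node`, the method names, and the slot name `modifier` are assumptions introduced purely for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A word in a dependency tree together with its filled slots."""
    word: str
    # Maps a slot name to the dependent filling it (slot names are illustrative).
    slots: dict = field(default_factory=dict)

    def fill(self, slot, dependent):
        """Fill a named slot with a dependent; a slot can be filled only once."""
        if slot in self.slots:
            raise ValueError(f"slot {slot!r} of {self.word!r} is already filled")
        self.slots[slot] = dependent
        return self

    def words(self):
        """Yield the head word, then the words of all its dependents."""
        yield self.word
        for dependent in self.slots.values():
            yield from dependent.words()


# Hypothetical analysis of the fragment "increased uptake":
# the head "uptake" has one slot, filled by the dependent "increased".
uptake = Node("uptake").fill("modifier", Node("increased"))
print(list(uptake.words()))  # ['uptake', 'increased']
```

Treating slots as a mapping from name to dependent makes the constraint "each slot is filled at most once" easy to enforce; a real grammar would additionally restrict which words may fill which slot.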

According to our own formalization [21, 22], which is based on Hellwig's dependency unification grammar (DUG) [12], a grammar

Generating conceptual graphs with dependency rules

A dependency tree makes visible the grammatical structure of a sentence. Its content or meaning, however, is represented in a different form. According to the notation of conceptual structures [20], the de facto standard for medical concept representation, the content of sentence (1) could be represented by:

  • [uptake] -

    • (existence) -> [certain]

    • (form) -> [area] -

      • (occurrences) -> [1],

    • (quantification) -> [increased]

    • (location) -> [lumbosacral junction] -

      • (relative position) -> [at]

      • (side) -> [right], , .


which is the
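For readers who want to experiment with the graph shown above, one way to hold it in memory is as a nested structure of concepts and relations. The sketch below is in Python rather than Prolog and makes several assumptions: the tuple encoding and the function `render` are hypothetical, and the comma and period terminators of the linear notation are deliberately left out.

```python
# A concept is a pair (label, relations); each relation is (name, target concept).
graph = ("uptake", [
    ("existence", ("certain", [])),
    ("form", ("area", [("occurrences", ("1", []))])),
    ("quantification", ("increased", [])),
    ("location", ("lumbosacral junction", [
        ("relative position", ("at", [])),
        ("side", ("right", [])),
    ])),
])


def render(concept, indent=0):
    """Return the graph as indented lines in the arrow style used in the text."""
    label, relations = concept
    suffix = " -" if relations else ""
    lines = [" " * indent + f"[{label}]{suffix}"]
    for name, target in relations:
        sub = render(target, indent + 2)
        # The first sub-line is the target concept; prefix it with the relation.
        lines.append(" " * (indent + 2) + f"({name}) -> {sub[0].strip()}")
        # Any remaining sub-lines are the target's own relations, already indented.
        lines.extend(sub[1:])
    return lines


lines = render(graph)
print("\n".join(lines))
```

Printing `lines` reproduces the indentation of the listing above, with one line per relation.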

Preliminary results and discussion

We have used the described formalization of dependency grammar to implement the Canon Group's core merged model of radiology findings [6, 9]. Attempts to generate all canonical graphs derivable from the model showed that it contains several design flaws [23]. In particular, in its current form it includes hidden sources of infinite recursion. For example, the subtype relations:

  • pleural_effusion < rad_finding.

  • pleural_effusion < effusion.

  • effusion < observation.


and the canonical graph
  • [rad_finding] -
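The three subtype relations listed above can be explored with a small lookup routine. The following Python sketch is an illustration only: it does not reconstruct the canonical graph, which is cut off here, and it is not the Prolog implementation used in this work. The names `SUBTYPE`, `supertypes`, and `is_subtype` are assumptions. The visited set is a standard guard that keeps the traversal finite even if a hierarchy were to contain a cycle.

```python
# Subtype facts as (subtype, supertype) pairs, copied from the relations above.
SUBTYPE = [
    ("pleural_effusion", "rad_finding"),
    ("pleural_effusion", "effusion"),
    ("effusion", "observation"),
]


def supertypes(concept, relations=SUBTYPE):
    """Return every proper supertype reachable from `concept`."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for sub, sup in relations:
            # Visit each supertype once, so cyclic declarations cannot loop forever.
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def is_subtype(sub, sup, relations=SUBTYPE):
    """True if `sub` equals `sup` or lies below it in the hierarchy."""
    return sub == sup or sup in supertypes(sub, relations)


print(sorted(supertypes("pleural_effusion")))
# ['effusion', 'observation', 'rad_finding']
```

Under these facts, `pleural_effusion` inherits from two unrelated branches, so any slot defined on `rad_finding`, `effusion`, or `observation` applies to it.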

Conclusion

The specification of medical language and concept representation is difficult and, without the aid of formal evaluation tools, error-prone. We have presented a uniform framework allowing the implementation and execution of dependency rules and canonical graphs on Prolog machines. This framework greatly facilitates the parsing and generation of natural language sentences and conceptual structures and should therefore prove a useful design tool.

Acknowledgements

The author's work has in part been supported by the Universitätsgesellschaft Hildesheim e.V.

References (26)

  • H. Gaifman, Dependency systems and phrase-structure systems, Inf. Control (1965)
  • G. Smolka et al., Inheritance hierarchies: semantics and unification, J. Symbolic Comput. (1989)
  • G. Smolka, Feature-constraint logics for unification grammars, J. Logic Program. (1992)
  • H. Aït-Kaci et al., LOGIN: A logic programming language with built-in inheritance, J. Logic Program. (1986)
  • R.H. Baud et al., Natural language processing and semantic representation of medical texts, Methods Inform. Med. (1994)
  • J. Bernauer, Subsumption principles underlying medical concept systems and their formal reconstruction, in: Proc. 18th...
  • K.E. Campbell et al., A logical foundation for representation of clinical data, J. Am. Med. Inform. Assoc. (1994)
  • B. Carpenter, The Logic of Typed Feature Structures, Cambridge University Press, Cambridge,...
  • D.A. Evans et al., Toward a medical-concept representation language, J. Am. Med. Inform. Assoc. (1994)
  • N. Fraser, Dependency Parsing, PhD Thesis, University College London, London,...
  • C. Friedman et al., A schema for representing medical language applied to clinical radiology, J. Am. Med. Inform. Assoc. (1994)
  • C. Friedman et al., The Canon Group's effort: working toward a merged model, J. Am. Med. Inform. Assoc. (1995)
  • D.G. Hays, Dependency theory: a formalism and some observations, Language (1964)