Intra-axiom redundancies in SNOMED CT
Introduction
SNOMED Clinical Terms (SNOMED CT) allows for meaning-based recording and retrieval of clinical information, which thereby becomes (re)usable. One of the advantages of SNOMED CT is its large size and coverage, which, on the other hand, makes defining new concepts and maintaining existing ones a challenging task.
Spackman [1] indicated back in 2001 that concept modellers have been uncertain about which elements are inherited from supertypes and therefore do not have to be added explicitly to a concept definition. Such intra-axiom redundancies, i.e. elements that are already entailed by other elements of the concept definition, are harmless from a logical point of view. However, they impede the maintainability of a terminology [2], [3], as they misleadingly suggest that new, meaningful information has been added to a concept.
Moreover, redundant elements can lead to content-related problems when concepts evolve. For example, the rolegroup in the subconcept Thyroid uptake with thyroid stimulation was redundant in the July 2012 version of SNOMED CT, because it repeated a rolegroup already contained in the definition of the superconcept Non-imaging thyroid uptake test (see Example 1.1). In the subsequent version of SNOMED CT, the method Radionuclide imaging was removed from the rolegroup in the superconcept, which makes sense for a concept named Non-imaging thyroid uptake test. However, the method was not removed from the rolegroup in the subconcept, as shown in Example 1.2, which is evidently incorrect. In this paper, we inventory redundant elements in SNOMED CT concept definitions.
Example 1.1 Two concept definitions in the July 2012 version of SNOMED CT. The definition of Thyroid uptake with thyroid stimulation contains a redundant element, the rolegroup (RG).
Example 1.2 Definitions of the concepts from Example 1.1 in the January 2013 version of SNOMED CT. The definition of Non-imaging thyroid uptake test has been corrected, but the previously redundant rolegroup is left unchanged.
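The situation in Examples 1.1 and 1.2 can be illustrated with a minimal Python sketch. It is not the detection method of this paper: it assumes a toy model in which a rolegroup is a set of attribute-value pairs and counts a rolegroup as redundant when all of its pairs occur in a rolegroup stated on an ancestor (the subsumption hierarchy of attribute values is ignored). The class and function names, and the pair marked as a placeholder, are hypothetical; only the two concept names and the method Radionuclide imaging come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set, Tuple

# A rolegroup is modelled as a set of (attribute, value) pairs (toy assumption).
RoleGroup = FrozenSet[Tuple[str, str]]


@dataclass
class Concept:
    parents: List[str] = field(default_factory=list)
    rolegroups: List[RoleGroup] = field(default_factory=list)


def inherited_rolegroups(name: str, concepts: Dict[str, Concept]) -> Set[RoleGroup]:
    """Collect the rolegroups stated on all ancestors of `name`."""
    seen: Set[str] = set()
    result: Set[RoleGroup] = set()
    stack = list(concepts[name].parents)
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.add(parent)
        result.update(concepts[parent].rolegroups)
        stack.extend(concepts[parent].parents)
    return result


def redundant_rolegroups(name: str, concepts: Dict[str, Concept]) -> List[RoleGroup]:
    """Return stated rolegroups whose pairs all occur in one inherited rolegroup."""
    inherited = inherited_rolegroups(name, concepts)
    return [
        rg for rg in concepts[name].rolegroups
        if any(rg <= inh for inh in inherited)
    ]


SUPER = "Non-imaging thyroid uptake test"
SUB = "Thyroid uptake with thyroid stimulation"
METHOD = ("Method", "Radionuclide imaging")
OTHER = ("Placeholder attribute", "Placeholder value")  # hypothetical filler pair

rg_2012: RoleGroup = frozenset({METHOD, OTHER})

# July 2012: the subconcept repeats the rolegroup of its superconcept.
v2012 = {
    SUPER: Concept(rolegroups=[rg_2012]),
    SUB: Concept(parents=[SUPER], rolegroups=[rg_2012]),
}

# January 2013: the method is removed from the superconcept only.
v2013 = {
    SUPER: Concept(rolegroups=[frozenset({OTHER})]),
    SUB: Concept(parents=[SUPER], rolegroups=[rg_2012]),
}

print(redundant_rolegroups(SUB, v2012) == [rg_2012])  # rolegroup is redundant
print(redundant_rolegroups(SUB, v2013) == [])         # no longer redundant
```

In the second version the rolegroup of the subconcept is no longer flagged, because it now states more than the superconcept does. Under these assumptions, this is how a formerly redundant element silently turns into asserted, and possibly wrong, content.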
SNOMED CT concept definitions and rolegroups
SNOMED CT is based on the lightweight Description Logic EL+ [4]. Its concepts are defined by conjunctions of other concepts as well as role-value pairs that are represented as exists restrictions (∃). These exists restrictions can be either ungrouped or grouped in so-called rolegroups “to add clarity to concept definitions. A rolegroup combines an attribute-value pair with one or more other attribute-value pairs. Rolegroups originated to add clarity to Clinical finding concepts which require
Materials and methods
We employed all 11 versions of SNOMED CT that were convertible to OWL, i.e. the January 2009 version to the January 2014 version. We converted these versions with the Perl script that is provided with each release of SNOMED CT. This script makes use of two tables: concepts and stated relationships. The latter faithfully represents the information as it was specified by modellers, and has been released since 2009.
We relied on the high-performance reasoner ELK [7] to classify SNOMED CT, and to
Redundant elements in concept definitions in the July 2012 version
Applying the four rules of redundancy detection to the July 2012 version of SNOMED CT, we found that 35,010 (12%) of the 296,433 concepts contain redundant elements in their definitions. Table 1 gives an overview of the results, counting only the first explanation found for each redundancy (the rules were applied in the order in which they are presented in this paper). Of these concepts, 11,858 are fully defined and 23,152 are non-trivially primitive.
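As a quick consistency check on the figures above, the following Python sketch recomputes the reported share and confirms that the two subgroups add up to the total. The variable names are illustrative; the numbers are those stated in the text.

```python
# Figures reported for the July 2012 version of SNOMED CT.
total_concepts = 296_433
redundant_concepts = 35_010
fully_defined = 11_858
non_trivially_primitive = 23_152

# The two subgroups should partition the redundantly defined concepts.
subgroups_sum = fully_defined + non_trivially_primitive
print(subgroups_sum == redundant_concepts)

# Share of all concepts with redundant elements, rounded to whole percent.
share_percent = round(100 * redundant_concepts / total_concepts)
print(f"{share_percent}% of concepts contain redundant elements")
```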
Example 4.1 Parenteral form thymoxamine.
Example 4.2 Closed skull
Related and future work
Campbell et al. [13] proposed a semantics-based conflict identification method for the distributed development of logic-based terminologies. Conflicts that can be detected are multiply-defined term conflicts and non-unique definition conflicts. Multiply-defined terms refer to the same term, but do not have the same definitions. They can be sub-classified into semantically-conflicting definitions and semantically equivalent definitions. When the definitions are semantically equivalent, it is
Discussion and conclusions
Our results show that 35,010 (12%) of all 296,433 SNOMED CT concepts in the July 2012 version were defined redundantly. These redundancies unnecessarily impede the work of concept modellers and ultimately harm the quality of a terminology. Redundant elements in concept definitions were introduced and resolved in comparable numbers in all versions of SNOMED CT between January 2009 and January 2014. On average, about three quarters of the introduced redundancies are caused by changes in definitions of
References (18)
- et al. Pellet: a practical OWL-DL reasoner. Web Semant: Sci Serv Agents World Wide Web (2007)
- et al. Auditing description-logic-based medical terminological systems by detecting equivalent concept definitions. Int J Med Inform (2008)
- et al. A review of auditing methods applied to the content of controlled biomedical terminologies. J Biomed Inform (2009)
- Normal forms for description logic expressions of clinical concepts in SNOMED RT
- et al. Logical Support for Terminological Modeling
- et al. Elimination of redundancy in ontologies
- et al. Is tractable reasoning in extensions of the description logic EL useful in practice?
- SNOMED CT Editorial Guide, Tech. rep. (2014)
- et al. Relationship groups in SNOMED CT
1 Ronald Cornet is a member of the Technical Committee of the International Health Terminology Standards Development Organisation (IHTSDO), which publishes SNOMED CT. His position at the IHTSDO, however, had no bearing on the research study or results.