Elsevier

Pattern Recognition Letters

Volume 29, Issue 15, 1 November 2008, Pages 2025-2031

A framework and baseline results for the CLEF medical automatic annotation task

https://doi.org/10.1016/j.patrec.2008.05.020

Abstract

The medical automatic annotation task issued by the cross language evaluation forum (CLEF) aims at a fair comparison of state-of-the-art algorithms for medical content-based image retrieval (CBIR). The contribution of this work is twofold: first, a logical decomposition of the CBIR task is presented, and key elements to support the relevant steps are identified: (i) implementation of algorithms for feature extraction, feature comparison, and classifier combination, (ii) visualization of extracted features and retrieval results, (iii) generic evaluation of retrieval algorithms, and (iv) optimization of the parameters for the retrieval algorithms and their combination. Data structures and tools to address these key elements are integrated into an existing framework for image retrieval in medical applications (IRMA). Second, baseline results for the CLEF annotation tasks 2005–2007 are provided by applying the IRMA framework, where global features and corresponding distance measures are combined within a nearest neighbor approach. Using identical classifier parameters and combination weights for each year shows that the task difficulty decreases over the years. The declining rank of the baseline submission also indicates the overall advances in CBIR concepts. Furthermore, a rough comparison between participants who submitted in only one of the years becomes possible.

Introduction

In 2005, a medical automatic annotation task (MAAT) was introduced by the cross language evaluation forum (CLEF) as part of its annual retrieval challenges. It requires the non-interactive classification of images into categories based on a multi-axial, hierarchical code (Deselaers et al., 2008). To face the challenge of optimizing algorithms and parameters, a storage concept is required not only to hold the images (CLEF MAAT 2007 consists of 12,000 images), but also to organize the experiments. Existing approaches like the GNU image finding tool (GIFT), or medGIFT (Hidki et al., in press) as a specialized version for the medical domain, provide such a framework for the development and application of retrieval algorithms. In particular, GIFT integrates user-implemented content descriptors, works on files, and stores extracted features in inverted files for fast retrieval. However, its algorithm model is rather strict: feature extraction operates on each image in isolation, and the system does not address development issues such as common pre-processing steps, access to intermediate features, or modularization.

The image retrieval in medical applications (IRMA) project has two main goals (Lehmann et al., 2004): on the system side, a framework is implemented which supports the development and execution of retrieval algorithms. It provides a database for storing feature data as well as definitions of algorithms, a runtime environment for their execution, and user interfaces for accessing the system functionality, e.g., interactive retrieval. On the retrieval side, it aims to implement, evaluate and verify a multi-step approach for the abstraction process for image content. This process models image content on different levels and therefore uses a wide range of features: global features for the coarse classification of the images, local (per-pixel) features as a basis for segmentation, and a hierarchical data structure for modelling the relationship between image regions and object identification (structural features for scene description).

The IRMA framework stores content descriptions inside a generic feature container (Güld et al., 2007), along with type information. A central database keeps track of their storage location and their generation history. The system supports both interactive and non-interactive retrieval algorithms. Algorithms are integrated as methods, which are implemented as functions in the C++ programming language and transform a tuple of input features into a tuple of output features. In addition to processing a set of single images independently from each other, IRMA also models transformations which compute features from a set of features, e.g., a principal component analysis (PCA). The method interfaces use the feature type specifications to enable compatibility checks when interconnecting methods. Networks are used to combine method instances into more complex retrieval algorithms by defining the propagation of features between the methods. Experiments are used to store a partial parameterization of networks by assigning parameters to some of the network sources. Thus, a network can be initialized for a specific task, leaving only certain sources unassigned. Such experiments can be used for retrieval, hiding internals from the user, who only has to specify the relevant information at query time, e.g., the set of images to be searched and the query image.
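As an illustration only (the actual IRMA methods are C++ functions; all class and field names below are invented for this sketch), the entities just described — typed feature containers with a generation history, methods as feature-tuple transformations, and type-checked connections between methods — might be modeled as:

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class Feature:
    ftype: str                      # feature type tag used for compatibility checks
    data: object                    # payload, e.g. a global feature vector
    history: List[str] = field(default_factory=list)  # generation history

@dataclass
class Method:
    name: str
    in_types: Tuple[str, ...]       # expected input feature types
    out_types: Tuple[str, ...]      # produced output feature types
    fn: Callable[..., Tuple[Feature, ...]]

    def compatible_with(self, other: "Method") -> bool:
        # A connection inside a network is valid if this method's output
        # types match the other method's expected input types.
        return self.out_types == other.in_types

    def run(self, *inputs: Feature) -> Tuple[Feature, ...]:
        assert tuple(f.ftype for f in inputs) == self.in_types
        outputs = self.fn(*inputs)
        for out in outputs:
            out.history.append(self.name)   # record generation history
        return outputs
```

For example, a feature-extraction method producing a "global_feature" could be wired to a comparison method expecting that type, and the framework could reject incompatible connections before any method is executed.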

At runtime, a central scheduling service determines which methods inside the parameterized network have a complete tuple of input features and are therefore ready to run. The central service dispatches pending method calls to a set of daemons running on other computers inside the local area network. The daemon dynamically loads the method function, runs it for the given input features, stores the output and reports back to the scheduler. Feature storage is handled transparently to the method. Using the generation history of each feature inside the database, already completed method calls are identified and skipped by the scheduler. The daemons run as background processes on commodity PC hardware (GNU/Linux) in a workstation pool of roughly a dozen machines. Early experiments (Güld et al., 2007) showed that performance was mostly limited by the central database and the central file server hosting the image data.
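The dispatch logic described above can be sketched in a few lines; this is a hypothetical simplification for illustration, not the IRMA scheduler itself, and the tuple layout is an assumption:

```python
def schedule(pending_calls, available, completed):
    """Toy model of the scheduling step described in the text.

    pending_calls: list of (method_name, input_feature_ids, output_feature_id)
    available:     set of feature ids already stored in the database
    completed:     set of output ids whose generation history shows the
                   call has already run (such calls are skipped)
    """
    ready, skipped = [], []
    for name, inputs, output in pending_calls:
        if output in completed:
            skipped.append(name)      # already computed: skip the call
        elif all(i in available for i in inputs):
            ready.append(name)        # complete input tuple: dispatch to a daemon
    return ready, skipped
```

In a real system this loop would repeat as daemons report results back, since each stored output may complete the input tuple of a downstream method.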

All user interfaces are running on a web-server and are therefore accessed via a web browser. A PHP class library for commonly used widgets and application steps (user accounting, preferences, consistent look & feel) was built using the Smarty template engine. Current interactive query functionality is based on the query by example (QBE) paradigm (Smeulders et al., 2000).

Through these components, the IRMA framework can be used to develop, run, and deploy a variety of image processing algorithms. CLEF MAAT is therefore regarded as a use case for the platform. While the storage requirements are addressed by the framework, the method concept does not impose any constraints on the design and implementation of the CBIR algorithms to be employed for CLEF MAAT. To address the problems of parameter optimization, common data structures must be developed so that algorithm-independent tools can be used for the evaluation and inspection of retrieval results. Besides common algorithm steps (like image pre-processing), the combination of CBIR algorithms as a whole must be supported efficiently, both for the developer during implementation and for the user during application of the algorithms.

Section snippets

Methods

By identifying the elementary steps of the CLEF MAAT (and CBIR in general), a logical layer above the layer of the framework components is introduced. The main challenge with algorithms for image analysis and categorization is their parameterization for optimal results. Furthermore, a finding observed in many experiments, e.g., the ImageCLEF2004med retrieval challenge (Thies et al., 2005), is that results improve when a combination of classifiers is used. Fig. 1 shows a decomposition

Network builder

For CLEF MAAT, the network builder was used to define networks for the extraction and comparison of four types of global features. These networks integrate ten methods, which were implemented by several programmers. The documentation (stored as part of the method entity in the database) enables the network composer to treat the methods as black boxes, i.e., independently of their source code, based solely on the type information from the method interface. While it is still a programming tool and

Discussion

To support CLEF MAAT, logical application steps were established on top of the IRMA framework to allow the efficient evaluation, combination and optimization of CBIR algorithms based on nearest neighbor classifiers. The result matrix feature type ensures that any pair of element-based classifiers can be combined and evaluated with two generic programs. Via iterated runs, the optimization of parameters can also be performed automatically with little additional effort and computational cost. The
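The kind of combination scheme discussed here — weighting and summing the distance matrices of several nearest-neighbor classifiers before taking the final decision — can be sketched as follows. This is an illustrative sketch only: the function names and the min-max normalization are assumptions for the example, not the paper's exact weighting scheme.

```python
import numpy as np

def combine_distances(dist_matrices, weights):
    """Combine per-classifier query-vs-reference distance matrices by
    min-max normalizing each one and summing with the given weights."""
    total = np.zeros_like(np.asarray(dist_matrices[0], dtype=float))
    for D, w in zip(dist_matrices, weights):
        D = np.asarray(D, dtype=float)
        span = D.max() - D.min()
        if span > 0:                      # guard against a constant matrix
            D = (D - D.min()) / span
        total += w * D
    return total

def nn_classify(combined, reference_labels):
    """1-NN decision: each query gets the label of its nearest reference."""
    return [reference_labels[int(np.argmin(row))] for row in combined]
```

With such a result-matrix representation, evaluating a new classifier combination only requires recomputing the weighted sum, not re-running feature extraction.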

Acknowledgement

This work is partly funded by the German Research Foundation (DFG), Grants Le 1108/4 and Le 1108/9. We also appreciate the peer reviewers’ useful comments and suggestions.

References (14)

  • Deselaers, T., et al., 2008. Automatic Medical Image Annotation in ImageCLEF 2007: Overview, Results, and Discussion. Pattern Recognition Letters, Special Issue on Medical Image Annotation in ImageCLEF 2007.
  • Güld, M.O., et al., 2007. A generic concept for the implementation of medical image retrieval systems. International Journal of Medical Informatics.
  • Argiro, D., et al. Cantata: The visual programming environment for the Khoros system.
  • Castelli, V., et al., 1998. Progressive search and retrieval in large image archives. IBM Journal of Research and Development.
  • Deserno, T.M., Güld, M.O., Plodowski, B., Spitzer, K., Wein, B.B., Schubert, H., Ney, H., Seidl, T., 2007. Extended...
  • Güld, M.O., Thies, C., Fischer, B., Lehmann, T.M., 2006. Content-based retrieval of medical images by combining global...
  • Güld, M.O., Thies, C., Fischer, B., Deserno, T.M., 2007. Baseline results for the ImageCLEF 2006 medical automatic...
There are more references available in the full text version of this article.
