A framework and baseline results for the CLEF medical automatic annotation task
Introduction
In 2005, a medical automatic annotation task (MAAT) was introduced by the Cross Language Evaluation Forum (CLEF) as part of its annual retrieval challenges. It requires the non-interactive classification of images into categories based on a multi-axial, hierarchical code (Deselaers et al., 2008). To face the challenge of optimizing algorithms and parameters, a storage concept is required not only to hold the images (CLEF MAAT 2007 consists of 12,000 images), but also to organize the experiments. Existing approaches such as the GNU Image Finding Tool (GIFT), or medGIFT (Hidki et al., in press) as a specialized version for the medical domain, provide such a framework for the development and application of retrieval algorithms. In particular, GIFT integrates user-implemented content descriptors, works on files, and stores extracted features in inverted files for fast retrieval. However, its algorithm model is rather strict: feature extraction operates on each image in isolation, and the system does not address development issues such as common pre-processing steps, access to intermediate features, or modularization.
The Image Retrieval in Medical Applications (IRMA) project has two main goals (Lehmann et al., 2004). On the system side, a framework is implemented that supports the development and execution of retrieval algorithms. It provides a database for storing feature data as well as definitions of algorithms, a runtime environment for their execution, and user interfaces for accessing the system functionality, e.g., interactive retrieval. On the retrieval side, the project aims to implement, evaluate and verify a multi-step approach to the abstraction of image content. This process models image content on different levels and therefore uses a wide range of features: global features for the coarse classification of the images, local (per-pixel) features as a basis for segmentation, and a hierarchical data structure for modelling the relationship between image regions and object identification (structural features for scene description).
The IRMA framework stores content descriptions inside a generic feature container (Güld et al., 2007), along with type information. A central database keeps track of their storage location and their generation history. The system supports both interactive and non-interactive retrieval algorithms. Algorithms are integrated as methods, which are implemented as functions in the C++ programming language and transform a tuple of input features into a tuple of output features. In addition to processing a set of single images independently of each other, IRMA also models transformations that compute features from a set of features, e.g., a principal component analysis (PCA). The method interfaces use the feature type specifications to enable compatibility checks when methods are interconnected. Networks combine method instances into more complex retrieval algorithms by defining the propagation of features between the methods. Experiments store a partial parameterization of a network by assigning parameters to some of the network sources. Thus, a network can be initialized for a specific task, leaving only certain sources unassigned. Such experiments can be used for retrieval: they hide internals from the user, who only has to specify the relevant information at query time, e.g., the set of images to be searched and the query image.
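The relationship between methods, networks and experiments can be illustrated with a minimal sketch. It is written in Python for brevity, whereas IRMA methods are C++ functions, and every name in it (`Method`, `Network`, `Experiment`, `connect`, `sources`, `unassigned`, the feature type strings) is a hypothetical stand-in rather than part of the IRMA interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Method:
    """Transforms a tuple of input features into a tuple of output features."""
    name: str
    input_types: Tuple[str, ...]
    output_types: Tuple[str, ...]
    func: Callable[..., tuple]


@dataclass
class Network:
    """Method instances plus edges that propagate features between them."""
    methods: Dict[str, Method] = field(default_factory=dict)
    # Each edge is (producer, output index, consumer, input index).
    edges: List[Tuple[str, int, str, int]] = field(default_factory=list)

    def add(self, method: Method) -> None:
        self.methods[method.name] = method

    def connect(self, src: str, out_idx: int, dst: str, in_idx: int) -> None:
        # Compatibility check based only on the declared feature types.
        produced = self.methods[src].output_types[out_idx]
        expected = self.methods[dst].input_types[in_idx]
        if produced != expected:
            raise TypeError(
                f"{src}[{out_idx}] yields {produced}, "
                f"{dst}[{in_idx}] expects {expected}"
            )
        self.edges.append((src, out_idx, dst, in_idx))

    def sources(self) -> List[Tuple[str, int]]:
        """Inputs not fed by any edge; they must be assigned from outside."""
        fed = {(dst, i) for _, _, dst, i in self.edges}
        return [
            (m.name, i)
            for m in self.methods.values()
            for i in range(len(m.input_types))
            if (m.name, i) not in fed
        ]


@dataclass
class Experiment:
    """Partial parameterization: values bound to some network sources."""
    network: Network
    bound: Dict[Tuple[str, int], object] = field(default_factory=dict)

    def unassigned(self) -> List[Tuple[str, int]]:
        return [s for s in self.network.sources() if s not in self.bound]


# Toy methods (assumptions for illustration only).
scale = Method("scale", ("image", "int"), ("image",),
               lambda img, size: (img[:size],))
histogram = Method("histogram", ("image",), ("vector",),
                   lambda img: ([img.count(v) for v in sorted(set(img))],))

net = Network()
net.add(scale)
net.add(histogram)
net.connect("scale", 0, "histogram", 0)      # image -> image: accepted

exp = Experiment(net, {("scale", 1): 32})    # fix the target size in advance
remaining = exp.unassigned()                 # only the query image is left
```

In this sketch the experiment leaves a single source open, which mirrors the idea that a user only supplies task-specific input at query time while the remaining parameters stay hidden.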
At runtime, a central scheduling service determines which methods inside the parameterized network have a complete tuple of input features and are therefore ready to run. The central service dispatches pending method calls to a set of daemons running on other computers inside the local area network. Each daemon dynamically loads the method function, runs it on the given input features, stores the output and reports back to the scheduler. Feature storage is handled transparently to the method. Using the generation history of each feature inside the database, the scheduler identifies method calls that have already been completed and skips them. The daemons run as background processes on off-the-shelf PC hardware (GNU/Linux) in a workstation pool of roughly a dozen machines. Early experiments (Güld et al., 2007) showed that performance was mostly limited by the central database and the central file server hosting the image data.
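The scheduling logic described above can be sketched as follows. This is a simplified, single-process illustration in Python, not the IRMA scheduler: daemons, dynamic loading and the database are replaced by plain dictionaries, and all names (`run_network`, `features`, `history`, the call tuples) are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical call description: (method name, input feature ids, output id).
Call = Tuple[str, Tuple[str, ...], str]


def run_network(
    calls: List[Call],
    functions: Dict[str, Callable],
    features: Dict[str, object],
    history: Dict[str, Tuple[str, Tuple[str, ...]]],
) -> List[str]:
    """Run every call that becomes ready; return the methods executed.

    `features` stands in for the feature storage and `history` for the
    generation history kept in the central database.
    """
    executed: List[str] = []
    pending = list(calls)
    progress = True
    while pending and progress:
        progress = False
        for call in list(pending):
            method, inputs, output = call
            if history.get(output) == (method, inputs):
                # Already computed in an earlier run: skip the call.
                pending.remove(call)
                progress = True
            elif all(i in features for i in inputs):
                # Complete input tuple: the call is ready to run.
                # A real scheduler would dispatch it to a daemon here.
                args = [features[i] for i in inputs]
                features[output] = functions[method](*args)
                history[output] = (method, inputs)
                executed.append(method)
                pending.remove(call)
                progress = True
    return executed


# Toy methods and a two-step network, listed out of execution order.
functions = {
    "scale": lambda img: img[:4],
    "mean": lambda img: sum(img) / len(img),
}
calls = [("mean", ("small",), "avg"), ("scale", ("raw",), "small")]
features = {"raw": [2, 4, 6, 8, 10]}
history = {}

first = run_network(calls, functions, features, history)
second = run_network(calls, functions, features, history)
```

The second invocation executes nothing, because both outputs are already recorded in the generation history; this is the property that lets iterated experiments reuse earlier intermediate features.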
All user interfaces run on a web server and are therefore accessed via a web browser. A PHP class library for commonly used widgets and application steps (user accounting, preferences, consistent look and feel) was built using the Smarty template engine. Current interactive query functionality is based on the query by example (QBE) paradigm (Smeulders et al., 2000).
Through these components, the IRMA framework can be used to develop, run, and deploy a variety of image processing algorithms. Therefore, CLEF MAAT is regarded as a use case for the platform. While the storage requirements are addressed by the framework, the method concept does not impose any constraints on the design and implementation of the CBIR algorithms to be employed for CLEF MAAT. To address the problems of parameter optimization, common data structures must be developed so that algorithm-independent tools can be used for the evaluation and inspection of retrieval results. Besides common algorithm steps (like image pre-processing), the combination of CBIR algorithms as a whole must be supported efficiently. This applies to both the developer's side and the user's side, during the implementation and the application of the algorithms.
Methods
By identifying the elementary steps of the CLEF MAAT (and CBIR in general), a logical layer above the layer of the framework components is introduced. The main challenge with algorithms for image analysis and categorization is their parameterization for optimal results. Furthermore, an early result observed in many experiments, e.g., the ImageCLEF2004med retrieval challenge (Thies et al., 2005), is the improvement of results if a combination of classifiers is used. Fig. 1 shows a decomposition
Network builder
For CLEF MAAT, the network builder was used to define networks for the extraction and comparison of four types of global features. These networks integrate ten methods, which were implemented by several programmers. The documentation (stored as part of the method entity in the database) enables the network composer to treat the methods as black boxes, i.e. independent from their source code, based solely on the type information from the method interface. While it is still a programming tool and
Discussion
To support CLEF MAAT, logical application steps were established on top of the IRMA framework to allow the efficient evaluation, combination and optimization of CBIR algorithms based on nearest neighbor classifiers. The result matrix feature type ensures that any pair of element-based classifiers can be combined and evaluated with two generic programs. Via iterated runs, the optimization of parameters can also be performed automatically with little additional effort and computational costs. The
Acknowledgement
This work is partly funded by the German Research Foundation (DFG), Grants Le 1108/4 and Le 1108/9. We also appreciate the peer reviewers’ useful comments and suggestions.
References (14)
- et al., 2008. Automatic Medical Image Annotation in ImageCLEF 2007: Overview, Results, and Discussion. Pattern Recognition Letters, Special Issue on Medical Image Annotation in ImageCLEF 2007.
- et al., 2007. A generic concept for the implementation of medical image retrieval systems. International Journal of Medical Informatics.
- et al. Cantata: The visual programming environment for the Khoros system.
- et al., 1998. Progressive search and retrieval in large image archives. IBM Journal of Research and Development.
(1998) - Deserno, T.M., Güld, M.O., Plodowski, B., Spitzer, K., Wein, B.B., Schubert, H., Ney, H., Seidl, T., 2007. Extended...
- Güld, M.O., Thies, C., Fischer, B., Lehmann, T.M., 2006. Content-based retrieval of medical images by combining global...
- Güld, M.O., Thies, C., Fischer, B., Deserno, T.M., 2007. Baseline results for the ImageCLEF 2006 medical automatic...