Elsevier

Pattern Recognition Letters

Volume 29, Issue 15, 1 November 2008, Pages 2011-2017

Hierarchical classification using a frequency-based weighting and simple visual features

https://doi.org/10.1016/j.patrec.2008.04.004

Abstract

This article describes the use of a frequency-based weighting scheme, built on low-level visual features originally developed for image retrieval, to perform a hierarchical classification of medical images. The techniques are based on the classical tf/idf (term frequency, inverse document frequency) weighting scheme of the GIFT (GNU Image Finding Tool) and perform classification with kNN (k-Nearest Neighbors) and voting-based approaches. The features used by the GIFT are very simple: they give a global description of the images as well as local information on fixed regions, for both colors and textures. We reused a technique similar to that of previous years of ImageCLEF to obtain a baseline for the retrieval performance over the three years of the medical image annotation task. This baseline makes it possible to show the clear increase in quality of participating research systems over the years.
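The combination of tf/idf weighting and kNN voting described above can be sketched as follows. This is a minimal illustration, not the GIFT implementation: it assumes each image is represented as a bag of quantized feature identifiers with frequencies, uses a generic tf/idf dot product rather than the GIFT's own weighting formula, and all function names, feature names, and class labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def idf_weights(database):
    """Inverse document frequency per feature.

    database: dict image_id -> Counter(feature -> frequency).
    A feature present in every image gets weight log(1) = 0.
    """
    n = len(database)
    df = Counter()
    for feats in database.values():
        df.update(feats.keys())
    return {f: math.log(n / c) for f, c in df.items()}


def tfidf_score(query, doc, idf):
    """Generic tf/idf similarity (assumed form, not the GIFT formula)."""
    return sum(
        tf * doc[f] * idf.get(f, 0.0) ** 2
        for f, tf in query.items()
        if f in doc
    )


def knn_vote(ranked, k=5, weighted=True):
    """Vote over the k best (label, score) pairs of a ranked list.

    weighted=True sums similarity scores per class;
    weighted=False counts one vote per neighbor.
    Ties go to the class that appears first in the ranking.
    """
    votes = defaultdict(float)
    for label, score in ranked[:k]:
        votes[label] += score if weighted else 1.0
    return max(votes, key=votes.get)


def classify(query, database, labels, k=5, weighted=True):
    """Rank all database images against the query, then vote."""
    idf = idf_weights(database)
    ranked = sorted(
        ((labels[i], tfidf_score(query, feats, idf))
         for i, feats in database.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return knn_vote(ranked, k, weighted)


# Toy data: feature names and class labels are invented for illustration.
database = {
    "a1": Counter({"dark": 3, "edge": 1}),
    "a2": Counter({"dark": 2, "edge": 1}),
    "b1": Counter({"bright": 3, "edge": 1}),
    "b2": Counter({"bright": 2, "edge": 2}),
}
labels = {"a1": "chest", "a2": "chest", "b1": "hand", "b2": "hand"}
query = Counter({"dark": 2, "edge": 1})
predicted = classify(query, database, labels, k=3)
```

Varying `k` and switching `weighted` on or off corresponds to the kind of neighbor-count and voting-scheme variations mentioned in the abstract, although the exact schemes evaluated in the article may differ.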

Subsequently, we optimized the retrieval results of this simple technology by varying the feature space and the classification method (number of neighbors, various voting schemes), and by adding new information such as aspect ratio, which has been shown to work well in the past. The results show that the techniques we use have several problems that could not be fully solved through the applied optimizations. Still, the optimizations improved results enormously, from an error value of 228 to below 150. The approach also works well as a baseline for showing the progress of techniques over the years. Aspect ratio proves to be an important factor for improving results. Classification at the axis level performs better than using the entire hierarchy code or ignoring the hierarchy altogether. To further improve results, more suitable visual features such as patch histograms or salient point features seem necessary. Small distortions of images of the same class have to be taken into account for very good results. Still, without using any learning technique or high-level visual features, the approach performs reasonably well.
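The difference between voting on the entire hierarchy code and voting per axis can be illustrated with a short sketch. It assumes that each class label is a code whose axes are separated by hyphens (as in the four-axis IRMA code format) and that each axis is decided by a simple majority among the retrieved neighbors. The function names and example codes are hypothetical, and the article's actual voting procedure may differ.

```python
from collections import Counter


def vote_full_code(codes):
    """Majority vote treating each complete code as one flat class.

    Ties go to the code encountered first (the best-ranked neighbor).
    """
    return Counter(codes).most_common(1)[0][0]


def vote_per_axis(codes, sep="-"):
    """Majority vote taken independently on each axis of the code.

    Assumes every code has the same number of sep-separated axes.
    """
    axes = zip(*(code.split(sep) for code in codes))
    return sep.join(Counter(axis).most_common(1)[0][0] for axis in axes)


# Illustrative neighbor codes, ordered by decreasing similarity.
neighbors = [
    "1121-120-200-700",
    "1121-127-310-700",
    "1123-120-310-700",
]
flat = vote_full_code(neighbors)
axis_wise = vote_per_axis(neighbors)
```

In this toy case all three complete codes differ, so the flat vote falls back to the first neighbor, whereas the per-axis vote can still find a majority on every axis and assembles a code that none of the neighbors carries in full.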

Introduction

Medical images are an extremely important part of the diagnosis process in medical institutions. As most hospitals now have computerized patient records and fully digitized image production, new possibilities arise for managing data and extracting information from the stored data (Müller et al., 2004a, Tagare et al., 1997, Vannier et al., 2002). As images have become digital, the number of images produced and their complexity have increased strongly. The Geneva University Hospitals radiology department alone produced over 70,000 images per day in 2007 (Müller et al., 2007) and these numbers continue to rise.

In other domains, content-based image retrieval has been used for many years to manage the growing amount of visual data (Datta et al., 2008, Smeulders et al., 2000, Kato, 1992, Rui et al., 1999). While early approaches used fairly low-level features such as global color distributions and texture characteristics (Niblack et al., 1993), more modern systems tend to use local features, obtained either through segmentation (Winter and Nastar, 1999) or in the form of salient points and their relations (Fergus et al., 2004, Tommasi et al., 2007). The latter obtained the best result in ImageCLEF 2007.

Object recognition in images has been another active research area, aiming to extract important information from potentially non-annotated images (Everingham et al., 2006, Pinz, 2005). In the medical domain, similar approaches have been used for medical image classification to extract information from these images (Lehmann et al., 2005). The dataset of the IRMA project (Image Retrieval in Medical Applications) is also used in the ImageCLEF benchmark, our participation in which is described in this article. Many of the techniques for image retrieval and for image classification are similar. However, classification considers a finite number of classes and training data are often available, whereas in information retrieval applications the number of classes occurring in the dataset is often unknown and training data are rarely available.

Several steps can generally be tuned to optimize the final performance.

  • Image pre-processing such as segmentation (Antani et al., 2004), normalization of gray levels, or background removal (Müller et al., 2005).

  • Extraction of domain-specific visual features (Müller et al., 2004b).

  • Optimization of the distance measure or weighting scheme to determine distances between elements.

  • Application of a learning strategy (such as Support Vector Machines) (Qiu, 2006).

In our approach, we use neither pre-processing nor any learning strategy. Efforts are concentrated on the optimization of the feature space and particularly on a classification strategy with our simple features, to test the limits of our retrieval engine, the GIFT. This approach cannot rival the performance of more modern approaches, particularly for learning/classification, such as the use of Support Vector Machines (SVMs) (Chapelle et al., 2002) or salient point-based visual features (Tommasi et al., 2007).

More details on the ImageCLEFmed benchmark, the corresponding classification setup, the error calculation, and the other participating techniques can be found in Deselaers et al. (2008).

In Section 2, the methods of our approach are explained in detail. Section 3 presents the results obtained with these methods. In the last section, we critically interpret our results and present the conclusions of this article.

Methods

This section describes the data used and the techniques employed.

Results

This section details the results obtained with the various techniques. The results of all participating research groups are compared, with their error values, in Deselaers et al. (2008).

Interpretation and discussion

In comparison with systems using modern visual techniques and machine learning approaches, the GIFT system with a simple kNN classification and without any learning strategy has a relatively low performance. However, the GIFT runs were initially meant to be a baseline to allow comparison with other techniques. The best overall results were obtained using SIFT (Scale Invariant Feature Transform) features and SVM-based learning approaches. Other top results used histograms of image patches or

Acknowledgements

This study was partially supported by the Swiss National Science Foundation (Grant 205321–109304/1) and the European Union in the Sixth Framework Program through the KnowARC project (Grant IST 032691). We also thank the reviewers for their constructive comments that helped to improve this paper.

References (26)

  • Gass, T., Geissbuhler, A., Müller, H., 2007. Learning a frequency-based weighting for medical image classification. In:...
  • Kato, T., 1992. Database architecture for content-based image retrieval. In: Jamberdino, A.A., Niblack, W. (Eds.),...
  • Lehmann, T.M., Schubert, H., Keysers, D., Kohnen, M., Wein, B.B., 2003. The IRMA code for unique classification of...