A computer-aided detection system for clustered microcalcifications

https://doi.org/10.1016/j.artmed.2010.04.007

Abstract

Objective

The aim of this paper is to describe a novel system for computer-aided detection of clusters of microcalcifications on digital mammograms.

Methods and material

Mammograms are first segmented by means of a tree-structured Markov random field algorithm that extracts the elementary homogeneous regions of interest. These regions are then analyzed by a two-stage, coarse-to-fine classification based on both heuristic rules and classifier combination. In this phase, no decision is taken on the individual microcalcifications; the decision is deferred to the subsequent clustering phase, which is carried out with a sequential approach.

Results

The system has been tested on a publicly available database of mammograms and compared with previous approaches. The obtained results show that the system is very effective, especially in terms of sensitivity.

Conclusions

The proposed approach exhibits some remarkable advantages both in segmentation and classification phases. The segmentation phase employs an image model that reduces the computational burden, preserving the small details in the image through an adaptive local estimation of all model parameters. The classification stage combines the results of the classifiers focused on the single microcalcification and the cluster as a whole. Such an approach makes a detection system particularly effective and robust with respect to the large variations exhibited by the clusters of microcalcifications.

Introduction

Mammography is presently the most effective radiological screening technique for detecting lesions in the breast using low doses of radiation, and thus, it plays a central role in early detection of breast cancer through mass screening. In fact, it makes it possible to diagnose breast cancer at a very early stage, when it is still feasible to successfully treat the disease with an effective (and possibly breast-conserving) therapy.

An important clue to breast cancer is the presence of microcalcifications (μC), which are tiny granule-like deposits of calcium that appear on the mammogram as small bright spots. When scattered throughout the mammary tissue, they are typically not of concern. On the contrary, clustered microcalcifications may be the only detectable manifestation of early breast cancer.

Visual interpretation of mammograms is a fatiguing and time-consuming task because of the small size of the microcalcifications (ranging from 0.1 mm to 0.7 mm) and the low contrast of the image. This is particularly true in mass screening, where a radiologist must examine a high number of mammograms in a day. This can give rise to a significant number of errors with very high costs, both socially and economically. Thus, a computer-aided detection (CADe) system could be very useful to the radiologist, both for prompting suspect cases and for helping in the diagnostic decision as a “second reading.” The goal is twofold: to improve the sensitivity of the diagnosis (i.e. the accuracy in recognizing all the actual clusters) and its specificity (i.e. the ability to avoid erroneous detections).

The approach followed by traditional CADe systems [1], [2], [3] entails a segmentation phase, after which a binary decision is made about each region extracted by segmentation (classification phase); in other words, each segmented object is classified as a microcalcification or an artifact. In the subsequent clustering phase, the regions classified as microcalcifications are grouped with very simple rules based exclusively on their proximity, to identify the clusters that are worth prompting.

However, the poor contrast on the mammogram makes it difficult to distinguish the calcifications from the mammary tissue in the background. This makes the feature extraction phase very critical, and could lead to unreliable feature values, which in turn, could negatively affect the classification. In such cases, the results of the classification would not be sufficiently accurate and the subsequent clustering could produce unsatisfactory results. In our work, we propose an alternative approach that, in the final stage, does not require a decision on the single regions during the classification phase. It considers the output of the classifier as a confidence degree of the single microcalcifications to be used in a successive phase of clustering, in which the recognition process is completed (delayed classification). In this way, a suitable clustering algorithm, which takes into account both the spatial coordinates of the regions and their confidence degree, aggregates them in the candidate clusters. The final decision is taken on such clusters instead of the single microcalcifications and considers both the features of the single microcalcifications therein and the characteristics of the whole cluster. The rationale is that the actual desired outcome of a CADe system is the position of the clusters of microcalcifications and not of the single microcalcifications, and accordingly, a single decision about the cluster, based on the evidence accumulated from the grouped microcalcifications, can produce more reliable results than many decisions taken separately on each microcalcification.

Another important point is the segmentation phase: a particularly effective tool is the Markov Random Field (MRF) model, which makes it possible to include a priori knowledge about the segmentation result. For this reason, the MRF model has previously been used for detecting clustered microcalcifications [4]; however, its common implementation is affected by two critical problems: the computational burden and the sensitivity of the results to the model parameters. Hence, in our approach, we have employed a Tree-Structured MRF (TS-MRF) model [5], which segments the image rapidly and is spatially adaptive, since all the field parameters are estimated locally.

The rest of the paper is organized as follows: Section 2 reviews the existing techniques for the detection of microcalcification clusters, while Section 3 gives an overview of the whole approach. The subsequent sections describe the steps of the method: preprocessing (Section 4), segmentation (Section 5), classification of the candidate microcalcifications (Section 6) and clustering (Section 7). Section 8 describes the experimental results obtained on a public database of images, while Section 9 draws some conclusions and outlines future research directions.

Section snippets

Review of the existing techniques

In recent years, CADe systems have undergone significant development with respect to automated detection and classification of microcalcification clusters in digitized mammograms. Many of the papers published in the literature are cited in the two recent surveys by Cheng et al. [6] and Nishikawa [7], which present a comparative analysis of various algorithms and techniques for the diagnosis of breast cancer on mammograms.

Two general approaches are commonly used [7]: the application of

The proposed approach

The research work presented in this paper falls within the second approach described in the previous section. As the size of the digitized mammographic image in input is greater than the region that contains the breast tissue, the first step of our approach comprises a preprocessing phase that recovers the region containing the breast tissue and deletes other signs or artifacts present in the image.

Once the breast region has been found, a segmentation phase is performed to decompose the image

Preprocessing

The preprocessing phase aims at locating the region containing the mammary tissue and eliminating the artifacts out of the mammogram. In this way, the size of the image to be segmented is reduced without the loss of information, and the complexity of the succeeding segmentation phase is decreased.

Accordingly, an adaptive image binarization is performed through an approach based on the Otsu algorithm [29]: the connected components are extracted and the largest one is selected as the area
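The binarization and component-selection step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a single global Otsu threshold (the paper mentions an adaptive variant whose details are not given here), assumes the breast tissue is brighter than the background, and uses 4-connectivity. The function names `otsu_threshold`, `largest_component` and `breast_region` are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of their gray levels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def largest_component(mask):
    """Return the (row, col) sites of the largest 4-connected foreground component."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            comp, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                comp.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    return best


def breast_region(image, levels=256):
    """Binarize with Otsu (assumed: tissue brighter than background), keep the largest blob."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat, levels)
    mask = [[p > t for p in row] for row in image]
    return largest_component(mask)


# Toy 4x5 "image": a bright 2x2 block plus one isolated bright artifact pixel.
image = [
    [10, 12, 11, 10, 200],
    [10, 180, 190, 11, 10],
    [12, 185, 195, 10, 10],
    [10, 10, 12, 11, 10],
]
region = breast_region(image)
```

On the toy image, the isolated bright pixel in the corner is discarded because it forms a smaller component than the 2x2 block, which mirrors how labels and artifacts outside the breast would be dropped.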

MRF model-based methods

Mammogram segmentation can be easily formulated in the statistical framework as an estimation problem. Indeed, a digital image can be conveniently modeled as a random field, namely, a set of random variables defined on a discrete set of sites. The set of sites $S$ identifies the geometry of the $N_1 \times N_2$ image through a finite regular rectangular lattice $S=\{(i,j): 1 \le i \le N_1,\ 1 \le j \le N_2\}$. A random field defined on $S$ is simply a collection of random variables, one for each site, $X=\{X_s, s \in S\}$ with realization $x=\{x_s$

Features

After segmentation, the image is subdivided into a huge number of elementary ROIs that, for the most part, contain background areas. For subsequent classification, each ROI is described with both geometrical and textural features, defined according to the known characteristics of the microcalcifications.

For each ROI, the employed features are:

  • Area: the number of pixels in the ROI.

  • Perimeter: the number of pixels on the contour of the ROI.

  • Compactness: Area/(Perimeter)^2.

  • X-axis: the maximum width
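The first three features in the list can be computed as in the sketch below. It is only an illustration under stated assumptions: a ROI is represented as a set of (row, col) pixels, and a contour pixel is taken to be a ROI pixel with at least one 4-neighbour outside the ROI (the paper does not specify the connectivity). The name `roi_features` is hypothetical.

```python
def roi_features(roi):
    """Compute area, perimeter and compactness of a ROI given as a set of (row, col) pixels."""
    if not roi:
        raise ValueError("ROI must contain at least one pixel")
    area = len(roi)
    # Assumed convention: a contour pixel has at least one 4-neighbour outside the ROI.
    perimeter = sum(
        1
        for (r, c) in roi
        if any(n not in roi for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)))
    )
    compactness = area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "compactness": compactness}


# A 3x3 square: every pixel except the centre lies on the contour.
square = {(r, c) for r in range(3) for c in range(3)}
features = roi_features(square)
# features == {"area": 9, "perimeter": 8, "compactness": 0.140625}
```

With this definition of compactness, compact blob-like regions score higher than thin elongated ones with the same area, since the latter have more contour pixels.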

Clustering algorithm

As stated in the previous sections, our approach postpones the decision on the single ROI until the clustering phase. In other words, the confidence degree estimated for each ROI in the classification phase is considered as an input feature for the clustering algorithm together with the spatial coordinates of the ROIs. In this way, we could obtain a reliable partition of the ROIs in the clusters of microcalcifications, which are successively validated by another classification phase working on
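The idea of clustering on both position and confidence can be illustrated with a simple sequential scheme. The sketch below is not the algorithm used in the paper, whose details are not given in this excerpt. It assumes that ROIs are visited in order of decreasing confidence, that a ROI joins the nearest cluster whose confidence-weighted centroid lies within a distance `max_dist`, and that confidences are strictly positive. The function name, the distance rule and the threshold are all hypothetical.

```python
import math


def sequential_clustering(rois, max_dist):
    """Group ROIs given as (x, y, confidence) tuples into clusters, one pass, no relabelling."""
    clusters = []
    # Visit the most confident ROIs first so that they seed the clusters (assumed ordering).
    for x, y, conf in sorted(rois, key=lambda r: -r[2]):
        best, best_d = None, max_dist
        for cl in clusters:
            d = math.hypot(x - cl["cx"], y - cl["cy"])
            if d <= best_d:
                best, best_d = cl, d
        if best is None:
            # No cluster is close enough: start a new one centred on this ROI.
            clusters.append({"members": [(x, y, conf)], "cx": x, "cy": y, "weight": conf})
        else:
            # Update the confidence-weighted centroid (confidences assumed > 0).
            w = best["weight"] + conf
            best["cx"] = (best["cx"] * best["weight"] + x * conf) / w
            best["cy"] = (best["cy"] * best["weight"] + y * conf) / w
            best["weight"] = w
            best["members"].append((x, y, conf))
    return clusters


rois = [(10, 10, 0.9), (12, 11, 0.6), (11, 13, 0.3), (80, 80, 0.8), (82, 79, 0.2)]
clusters = sequential_clustering(rois, max_dist=10)
for cl in clusters:
    print(len(cl["members"]), round(cl["weight"], 2))
```

No ROI is rejected at this stage: a low-confidence ROI still contributes to its cluster, and the accumulated `weight` is one example of a cluster-level quantity that a later validation step could use.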

Database

The system has been tested on a standard database, publicly available, which was kindly provided by the National Expert and Training Centre for Breast Cancer Screening and the Department of Radiology at the University of Nijmegen, the Netherlands. The database comprised 40 digitized mammographic images composed of both oblique and craniocaudal views from 21 patients. The images were digitized from the films at a size of 2048×2048 using a 12-bit CCD camera (Eikonix 1412), with a sampling

Conclusions

In this paper, we described a CADe system for the detection of clusters of microcalcifications in mammograms. An experimental analysis performed on a standard database demonstrated that the proposed approach is very efficient and effective in locating true-positive (TP) clusters of microcalcifications when compared with other methods in the literature.

The approach is innovative and presents some advantages both in segmentation and classification phases. In the segmentation phase, the tree structure of the

References (37)

  • L. Wei et al. A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications. IEEE Transactions on Medical Imaging (2005)

  • W. Veldkamp et al. Improved correction for signal dependent noise applied to automatic detection of microcalcifications

  • C. D’Elia et al. A tree-structured Markov random field model for Bayesian image segmentation. IEEE Transactions on Image Processing (2003)

  • R. Nishikawa. Current status and future directions of computer-aided diagnosis in mammography. Computerized Medical Imaging and Graphics (2007)

  • Y. Wu et al. Artificial neural networks in mammography: application to decision making in the diagnosis of breast cancer. Radiology (1993)

  • L. Wei et al. Relevance vector machine for automatic detection of clustered microcalcifications. IEEE Transactions on Medical Imaging (2005)

  • R. Nakayama et al. Computer-aided diagnosis scheme using a filter bank for detection of microcalcification clusters in mammograms. IEEE Transactions on Biomedical Engineering (2006)

  • R. Strickland et al. Wavelet transform for detecting microcalcifications in mammograms. IEEE Transactions on Medical Imaging (1996)