Neighbourhood search feature selection method for content-based mammogram retrieval

Original Article · Medical & Biological Engineering & Computing

Abstract

Content-based image retrieval plays an increasingly important role in the clinical process for supporting diagnosis. This paper proposes a neighbourhood search method to select near-optimal feature subsets for the retrieval of mammograms from the Mammographic Image Analysis Society (MIAS) database. Features based on the grey level co-occurrence matrix, Daubechies-4 wavelet, Gabor filters, Cohen–Daubechies–Feauveau 9/7 wavelet and Zernike moments are extracted from the mammograms in the MIAS database to form the combined (fused) feature set used to test the various feature selection methods. The performance of the feature selection methods is evaluated using precision, storage requirement and retrieval time. Using the proposed method, a significant improvement is achieved in mean precision rate and feature dimension. The results show that the proposed method outperforms state-of-the-art feature selection methods.

Notes

  1. Throughout this paper, a solution (feature subset) that can be regarded as arbitrarily close to the optimal solution is termed near-optimal.

References

  1. Ahuja RK, Ergun O, Punnen A (2002) A survey of very large scale neighborhood search techniques. Discrete Appl Math 123:75–102

  2. Chandy DA, Johnson JS, Selvan SE (2014) Texture feature extraction using gray level statistical matrix for content-based mammogram retrieval. Multimed Tools Appl 72(2):2011–2024

  3. Cheng HD, Shi XJ, Min R, Hu LM, Cai XP, Du HN (2006) Approaches for automated detection and classification of masses in mammograms. Pattern Recognit 39(4):646–668

  4. Daubechies I, Sweldens W (1998) Factorizing wavelet transforms into lifting steps. J Fourier Anal Appl 4(3):247–267

  5. De Sousa EPM et al (2002) How to use fractal dimension to find correlations between attributes. In: Proceedings of KDD-workshop on fractals and self-similarity in data mining: issues and approaches

  6. De Wouver GV, Scheunders P, Dyck DV (1999) Statistical texture characterization from discrete wavelet representations. IEEE Trans Image Process 8(4):592–598

  7. Do MN, Vetterli M (2002) Wavelet-based texture retrieval using generalized Gaussian density and Kullback–Leibler distance. IEEE Trans Image Process 11(2):146–158

  8. Eisa M, Refaat M, El-Gamal AF (2009) Preliminary diagnostics of mammograms using moments and texture features. Int J Graph Vis Image Process 9:21–27

  9. El-Naqa I, Yang Y, Galatsanos NP, Nishikawa RM, Wernick MN (2004) A similarity learning approach to content-based image retrieval: application to digital mammography. IEEE Trans Med Imaging 23(10):1233–1244

  10. Felipe JC, Olioti JB, Traina AJM, Ribeiro MX, Souza EPM, Junior CT (2005) A low cost approach for effective shape-based retrieval and classification of medical images, In: Proceedings of seventh IEEE international symposium on multimedia, pp 6–7

  11. Felipe JC, Traina AJM, Ribeiro MX, Souza EPM, Junior CT (2006) Effective shape-based retrieval and classification of mammograms, In: Proceedings of the twenty first annual ACM symposium on applied computing, pp 250–255

  12. Ferrari RJ, Frere AF, Rangayyan RM, Desautels JEL, Borges RA (2004) Identification of the breast boundary in mammograms using active contour models. Med Biol Eng Comput 42:201–208

  13. Giger ML, Huo Z, Kupinski MA, Vyborny CJ (2000) Computer aided diagnosis in mammography. In: Fitzpatrick JM, Sonka M (eds) Handbook of medical imaging 2: medical image processing and analysis. SPIE, Bellingham, pp 915–1004

  14. Gonzalez-Garcia AC, Sossa-Azuela JH, Felipe-Riveron EM (2007) Image retrieval based on wavelet computation and neural network classification. In: Eighth IEEE international workshop on image analysis for multimedia interactive services (WIAMIS ’07), June 2007

  15. Greenspan H, Pinhas AT (2007) Medical image categorization and retrieval for pacs using GMM-KL framework. IEEE Trans Inf Technol Biomed 11(2):190–202

  16. Haralick RM, Shanmugam K, Dinstein I (1973) Texture features for image classification. IEEE Trans Syst Man Cybern 3(6):610–621

  17. Huang Y-J, Chan D-Y, Cheng D-C, Ho Y-J, Tsai P-P, Shen W-C, Chen R-F (2013) Automated feature set selection and its application to MCC identification in digital mammograms for breast cancer detection. Sensors 13:4855–4875

  18. Jose TJ, Mythili P (2009) Neural network and genetic algorithm based hybrid model for content based mammogram image retrieval. Appl Sci 9:3531–3538

  19. Kay SM (1993) Fundamentals of statistical signal processing. Volume 1: Estimation theory. Prentice-Hall, Englewood Cliffs

  20. Khotanzad A, Hong YH (1990) Invariant image recognition by Zernike moments. IEEE Trans Pattern Anal Mach Intell 12(5):489–497

  21. Kinoshita SK, Azevedo-Marques PM, Pereira RR, Rodrigues JAH, Rangayyan RM (2007) Content-based retrieval of mammograms using visual features related to breast density patterns. J Digit Imaging 20(2):172–190

  22. Lakovidis DK, Maroulis DE, Bariamis DG (2007) FPGA architecture for fast parallel computation of co-occurrence matrices. Microprocess Microsyst 31(2):160–165

  23. Lamard M, Cazuguel G, Quellec G, Bekri L, Roux C, Cochener B (2007) Content-based image retrieval based on wavelet transform coefficients distribution, In: Proceedings of the twenty ninth annual international conference of the IEEE Engineering in Medicine and Biology Society. IEEE Press, Lyon, pp 4532–4535

  24. Li S, Lee M-C, Pun C-M (2009) Complex Zernike moments features for shape-based image retrieval. IEEE Trans Syst Man Cybern Part A Syst Hum 39(1):227–237

  25. Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842

  26. Mu T, Nandi AK, Rangayyan RM (2007) Classification of breast masses via nonlinear transformation of features based on a kernel matrix. Med Biol Eng Comput 45:769–780

  27. Muller H, Muller W, Squire DM, Marchand-Maillet S, Pun T (2005) Performance evaluation on content-based image retrieval: overview and proposals. Pattern Recognit Lett 5:134–143

  28. Quellec G, Lamard M, Cazuguel G, Cochener B, Roux C (2010) Wavelet optimization for content-based image retrieval in medical databases. Med Image Anal 14:227–241

  29. Serdobolskii V (2000) Multivariate statistical analysis: a high-dimensional approach. Kluwer, London

  30. Stolpman A, Dooley LS (1998) Genetic algorithms for automatic feature selection in a texture classification system. In: Proceedings of fourth international conference on signal processing, pp 1229–1232

  31. Suckling J, Parker J, Dance DR, Astley SM, Hutt I, Boggis CRM, Ricketts I, Stamatakis E, Cerneaz N, Kok SL, Taylor P, Betal D, Savage J (1994) Mammographic image analysis society digital mammogram database. In: Proceedings of international workshop on digital mammography, pp 211–221

  32. Sun J, Zhang Z (2008) An effective method for mammograph image retrieval. In: Proceedings of international conference on computational intelligence and security, pp 190–193

  33. Wang W, Li L, Liu W, Xu W (2009) A new two-stage hierarchical framework for mammogram retrieval, In: Proceedings of third international conference on bioinformatics and biomedical engineering, pp 1–4

  34. Wei C (2005) A content-based approach to medical image database retrieval. J Vis Commun Image Represent 15(5):285–302

  35. Wei C, Li CT, Li Y (2008) Content-based retrieval of mammograms. In: Ma ZM (ed) Artificial intelligence for maximizing content-based image retrieval. Idea Group Publishing, Hershey, pp 313–339

  36. Wei C, Li C (2006) Calcification descriptor and relevance feedback learning algorithms for content-based mammogram retrieval. In: Proceedings of the eighth international workshop on digital mammography, pp 307–314

  37. Wei C, Li Y, Li C (2007) Effective extraction of Gabor features for adaptive mammogram retrieval. In: Proceedings of IEEE international conference on multimedia and expo, pp 1503–1506

  38. Yin FF, Giger ML, Doi K, Vyborny CJ, Schmidt RA (1994) Computerized detection of masses in digital mammograms: investigation of feature analysis techniques. J Digit Imaging 7(1):18–26

  39. Zighed DA, Tsumoto S, Ras ZW, Hacid H (eds) (2009) Mining complex data, studies in computational intelligence 165. Springer, Berlin. e-ISBN: 978-3-540-88067-7

Author information

Corresponding author

Correspondence to D. Abraham Chandy.

Appendices

Appendix 1

1.1 GLCM features

The GLCM of an \(N \times N\) pixel image I contains the probabilities \(p_{d,\theta }(i,j)\) of a transition from grey level i to grey level j in a given direction \(\theta\) at a pixel distance d, expressed as:

$$\begin{aligned} p_{d, \theta }(i, j) = \frac{C_{d, \theta }(i, j)}{\sum _{i=1}^{N_g}\sum _{j=1}^{N_g}C_{d, \theta }(i, j)}, \end{aligned}$$
(3)

where

$$\begin{aligned} C_{d, \theta }(i,j) = \# \{((x_1, y_1), (x_2, y_2)) \in (N \times N)^2 :{}& f(x_1, y_1)=i,\ f(x_2, y_2)=j,\\ & |(x_1, y_1) - (x_2, y_2)| = d,\ \angle ((x_1, y_1), (x_2, y_2)) = \theta \}. \end{aligned}$$

Here, \(\#\) denotes the cardinality of the set, \(f(\cdot )\) denotes the grey level of the pixels at locations \((x_1,y_1)\) and \((x_2, y_2)\), and \(N_g\) is the total number of grey levels in the image. The common choices for \(\theta\) are \(0^{\circ }\), \(45^{\circ }\), \(90^{\circ }\) and \(135^{\circ }\) [22]. The expressions and the significance of the fourteen statistical measures, called Haralick’s features [26], that are extracted from the GLCMs are given in [16].
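
For concreteness, the following is a minimal NumPy sketch of Eq. (3): it accumulates the co-occurrence counts for one (d, θ) pair, normalises them into probabilities, and evaluates two of the fourteen Haralick measures (contrast and energy) as an illustration. The angle-to-offset mapping and all function names are illustrative choices, not part of the paper.

```python
import numpy as np

def glcm(image, d=1, theta=0, levels=256):
    """Return the normalised co-occurrence matrix p_{d,theta}(i, j) of Eq. (3)."""
    # Pixel offset (row, col) implied by the angle theta in degrees.
    offsets = {0: (0, d), 45: (-d, d), 90: (-d, 0), 135: (-d, -d)}
    dr, dc = offsets[theta]
    counts = np.zeros((levels, levels), dtype=np.float64)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts / counts.sum()                       # Eq. (3)

def haralick_contrast_energy(p):
    """Two of the fourteen Haralick measures [16], shown for illustration."""
    i, j = np.indices(p.shape)
    contrast = np.sum(((i - j) ** 2) * p)
    energy = np.sum(p ** 2)                            # angular second moment
    return contrast, energy

# Example on a small synthetic 8-level patch.
patch = np.random.randint(0, 8, size=(64, 64))
print(haralick_contrast_energy(glcm(patch, d=1, theta=0, levels=8)))
```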

1.2 Gabor features

A set of 24 Gabor filters [25] is obtained by appropriate dilations and rotations of the Gabor function,

$$\begin{aligned} g(x, y) = \frac{1}{2\pi \sigma _x \sigma _y} \exp \left[ - \frac{1}{2} \left( \frac{x^2}{\sigma _x^2}+\frac{y^2}{\sigma _y^2}\right) +2\pi jU_h x \right] \end{aligned}$$
(4)

through the generating function

$$\begin{aligned} g_{mn}(x,y) = a^{-m}g(x', y'). \end{aligned}$$
(5)

Here,

$$\begin{aligned} \sigma _x&= {} \frac{(a+1)(2\ln 2)^{\frac{1}{2}}}{2\pi (a-1) U_h },\\ \sigma _y&= {} \displaystyle \frac{1}{2\pi \tan \left( \frac{\pi }{2K}\right) \left[ U_h - 2 \ln \left( \frac{\sigma _u^2}{U_h}\right) \right] \left[ 2 \ln 2 - \frac{(2 \ln 2)^2\sigma _u^2}{U_h^2}\right] ^{-\frac{1}{2}}}, \end{aligned}$$

\(j=\sqrt{-1}\), \(x' = a^{-m}\left( x\cos \phi + y \sin \phi \right)\) and \(y' = a^{-m}\left( -x \sin \phi + y \cos \phi \right)\); \(\phi\) and a are taken as \(\frac{n\pi }{K}\) and \(\left( \frac{U_h}{U_l}\right) ^{\frac{1}{S-1}}\), respectively. The following values are used in our implementation as reported in [25]: m and n are integers selected in the intervals [1, S] and [1, K], respectively, \(U_l = 0.05\), \(U_h = 0.4\), and \(K=6\) and \(S=4\) are the total number of orientations and the number of scales, respectively. Six features [37] are extracted from the outputs of the Gabor filters given by

$$\begin{aligned} G_{mn}(x,y)= \text{ Re }(I(x, y) *\bar{g}_{mn}(x, y)), \end{aligned}$$
(6)

where \(\text{ Re }(\cdot )\) represents the real part of the argument, \(*\) is the convolution operator and \(\bar{g}_{mn}(x,y)\) is the complex conjugate of the generating function.
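
A rough sketch of the filter-bank computation is given below. It builds rotated and dilated Gabor kernels in the spirit of Eqs. (4) and (5) and takes the mean and standard deviation of the real responses of Eq. (6). The fixed σ values and the frequency schedule are simplifying assumptions rather than the exact parameterisation of [25], and the statistics computed are illustrative rather than the six features of [37].

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, sigma_x, sigma_y, u, phi):
    """Rotated/dilated complex Gabor kernel in the spirit of Eqs. (4)-(5)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(phi) + y * np.sin(phi)          # rotate coordinates by phi
    yr = -x * np.sin(phi) + y * np.cos(phi)
    return (1.0 / (2 * np.pi * sigma_x * sigma_y)) * np.exp(
        -0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2)
        + 2j * np.pi * u * xr)

def gabor_features(image, scales=4, orientations=6):
    """Mean and standard deviation of the real responses of Eq. (6)."""
    feats = []
    for m in range(scales):
        for n in range(orientations):
            # Illustrative dilation/rotation schedule; sigma values are assumptions.
            u = 0.4 / (2 ** m)
            phi = n * np.pi / orientations
            k = gabor_kernel(31, sigma_x=4.0 * (m + 1), sigma_y=4.0 * (m + 1),
                             u=u, phi=phi)
            resp = np.real(fftconvolve(image, np.conj(k), mode='same'))
            feats.extend([resp.mean(), resp.std()])
    return np.asarray(feats)
```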

1.3 Daubechies (Db4) wavelet features

For an integer r, the Daubechies wavelet can be defined as [14]

$$\begin{aligned} \phi _{r,j,k}(x) = 2^{j/2} \phi _r (2^j x - k),\quad j, k \in Z \end{aligned}$$
(7)

where j is a scale, k is a translation and r is the filter order. The coefficients of the orthonormal decomposition filter pair for the Daubechies 4 wavelet function can be expressed as follows:

$$\begin{aligned} G(n)= & {} \left[ \frac{1-\sqrt{3}}{4\sqrt{2}}, - \frac{3-\sqrt{3}}{4\sqrt{2}}, \frac{3+\sqrt{3}}{4\sqrt{2}}, - \frac{1+\sqrt{3}}{4\sqrt{2}} \right] \end{aligned}$$
(8)
$$\begin{aligned} H(n)= & {} \left[ \frac{1+\sqrt{3}}{4\sqrt{2}}, \frac{3+\sqrt{3}}{4\sqrt{2}}, \frac{3-\sqrt{3}}{4\sqrt{2}}, \frac{1-\sqrt{3}}{4\sqrt{2}} \right] \end{aligned}$$
(9)

As per the parameter estimation method mentioned in [7], the estimated GGD model parameters of the Daubechies (Db4) wavelet coefficient distribution in an \(M \times N\) subband \(X = \{x_{i,j},\ i = 1, \ldots ,M,\ j = 1,\ldots ,N\}\), namely \(\widehat{\alpha }\) and \(\widehat{\beta }\), are obtained as follows:

$$\begin{aligned} \widehat{\alpha } = \left( \frac{\beta }{MN}\sum _{i=1}^M \sum _{j=1}^N \left| x_{i,j}\right| ^{\beta }\right) ^{1/\beta } \end{aligned}$$
(10)

where \(\beta\) is an approximation of \(\widehat{\beta }\) determined using the Newton–Raphson iterative procedure [7]. In this work, the GGD model parameters are estimated from the detail subbands of each decomposition level, resulting in an 18-component feature vector for a given mammogram.
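
A possible implementation using the PyWavelets package is sketched below. For brevity, β is estimated here by moment matching rather than the Newton–Raphson maximum-likelihood procedure of [7]; α then follows Eq. (10). The resulting 18-component vector matches the layout described above (two parameters for each of the nine detail subbands over three levels).

```python
import numpy as np
import pywt
from scipy.special import gamma
from scipy.optimize import brentq

def ggd_parameters(coeffs):
    """Estimate (alpha, beta) of a generalised Gaussian for one subband.

    beta: moment matching (a stand-in for the Newton-Raphson ML scheme of [7]);
    alpha: Eq. (10) with the estimated beta.
    """
    x = np.abs(coeffs.ravel()) + 1e-12
    m1, m2 = x.mean(), (x ** 2).mean()
    r = lambda b: gamma(2.0 / b) ** 2 / (gamma(1.0 / b) * gamma(3.0 / b))
    ratio = min(max(m1 ** 2 / m2, r(0.06)), r(9.9))     # keep the root bracketed
    beta = brentq(lambda b: r(b) - ratio, 0.05, 10.0)
    alpha = (beta * np.mean(x ** beta)) ** (1.0 / beta)  # Eq. (10)
    return alpha, beta

def db4_ggd_features(image, levels=3):
    """(alpha, beta) for each detail subband: 3 levels x 3 subbands x 2 = 18 values."""
    coeffs = pywt.wavedec2(image, 'db4', level=levels)
    feats = []
    for detail_level in coeffs[1:]:        # (cH, cV, cD) tuples
        for subband in detail_level:
            feats.extend(ggd_parameters(subband))
    return np.asarray(feats)
```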

1.4 CDF 9/7 wavelet features

The lifting scheme for the CDF 9/7 wavelet includes two sets of predict (\(p_1\) and \(p_2\)) and update (\(u_1\) and \(u_2\)) filters. The values used in the CDF 9/7 wavelet implementation are as follows: \(\alpha \approx -1.58613432\) and \(\delta \approx 0.8829110762\) for prediction; \(\beta \approx -0.05298011854\) and \(\gamma \approx 0.4435068522\) for updating; \(\zeta \approx 1.149604398\) for scaling. The prediction errors, updated odd sets and normalized outputs are computed as given in Eqs. (11), (12) and (13), respectively [4].

$$\begin{aligned} x_e^1(n)&= {} x_e(n) - \alpha (x_o(n+1) + x_o(n))\nonumber \\ x_e^2(n)&= {} x_e^1(n) - \delta (x_o^1(n+1) + x_o^1(n)) \end{aligned}$$
(11)
$$\begin{aligned} x_o^1(n)&= {} x_o(n) + \beta (x_e^1(n) + x_e^1(n-1)) \nonumber \\ x_o^2(n)&= {} x_o^1(n) + \gamma (x_e^2(n) + x_e^2(n-1))\end{aligned}$$
(12)
$$\begin{aligned} x_o^3(n)&= {} \zeta x_o^2(n); \quad x_e^3 = \frac{x_e^2(n)}{\zeta } \end{aligned}$$
(13)

The above procedure is repeated with \(x = x_o^3\) for the next decomposition level. As mentioned in [28], a maximum of three levels of decomposition is chosen for this work. Next, the GGD model parameters [7] are estimated from the detail subbands of each level and a 32-bin histogram is computed from the third-level approximation coefficients, yielding a feature vector of size 50 for a given mammogram.
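
The lifting steps can be transcribed almost literally; the sketch below applies Eqs. (11)–(13) to a one-dimensional signal of even length. The even/odd split and the symmetric boundary handling are implementation assumptions not specified in the text.

```python
import numpy as np

ALPHA, DELTA = -1.58613432, 0.8829110762      # prediction coefficients
BETA, GAMMA = -0.05298011854, 0.4435068522    # update coefficients
ZETA = 1.149604398                            # scaling coefficient

def cdf97_lifting_1d(x):
    """One decomposition level of Eqs. (11)-(13); x is assumed to have even length."""
    xe, xo = np.asarray(x, float)[0::2], np.asarray(x, float)[1::2]

    def nxt(v):   # v(n+1), repeating the last sample at the boundary
        return np.concatenate([v[1:], v[-1:]])

    def prv(v):   # v(n-1), repeating the first sample at the boundary
        return np.concatenate([v[:1], v[:-1]])

    xe1 = xe - ALPHA * (nxt(xo) + xo)          # Eq. (11), first prediction
    xo1 = xo + BETA * (xe1 + prv(xe1))         # Eq. (12), first update
    xe2 = xe1 - DELTA * (nxt(xo1) + xo1)       # Eq. (11), second prediction
    xo2 = xo1 + GAMMA * (xe2 + prv(xe2))       # Eq. (12), second update
    xo3, xe3 = ZETA * xo2, xe2 / ZETA          # Eq. (13), normalisation
    return xo3, xe3                            # xo3 feeds the next decomposition level
```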

1.5 Zernike moments features

The mammogram region of interest is pre-processed [10] and 256 ZMs are extracted without the need for prior image segmentation. For a digital image, the ZMs \(A_{pq}\) of order p with repetition q are computed as follows [24]:

$$\begin{aligned} A_{pq} = \frac{p+1}{\pi }\sum _{i} f(x_i, y_i)V_{pq}^*(\rho , \theta );\quad x_i^2+y_i^2\le 1 \end{aligned}$$
(14)

where i runs over all the image pixels, p is a non-negative integer, q is an integer subject to the constraints that \(p - |q|\) is even and \(|q| \le p\), and ‘\(*\)’ denotes complex conjugation.

$$\begin{aligned} V_{pq}(\rho , \theta )= {} R_{pq}(\rho )\exp (jq\theta ); \quad \rho = \sqrt{x^2+y^2} ;\quad \theta = \tan ^{-1}(y/x)\end{aligned}$$
(15)
$$\begin{aligned} R_{pq}(\rho )= & {} \sum _{s=0}^{(p-|q|)/2} \frac{(-1)^s \left[ (p-s)!\right] \rho ^{p-2s}}{s!\left( \frac{p+|q|}{2}-s\right) !\left( \frac{p-|q|}{2}-s\right) !} \end{aligned}$$
(16)
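
A compact sketch of Eqs. (14)–(16) is shown below. The mapping of pixel coordinates onto the unit disc (centre and scaling) is an assumption, since the text does not specify it.

```python
import numpy as np
from math import factorial

def radial_poly(p, q, rho):
    """R_pq(rho) of Eq. (16)."""
    q = abs(q)
    out = np.zeros_like(rho)
    for s in range((p - q) // 2 + 1):
        num = (-1) ** s * factorial(p - s)
        den = (factorial(s) * factorial((p + q) // 2 - s)
               * factorial((p - q) // 2 - s))
        out += (num / den) * rho ** (p - 2 * s)
    return out

def zernike_moment(image, p, q):
    """A_pq of Eq. (14) for an image mapped onto the unit disc."""
    rows, cols = image.shape
    y, x = np.mgrid[0:rows, 0:cols].astype(float)
    x = (2 * x - (cols - 1)) / (cols - 1)       # map columns to [-1, 1]
    y = (2 * y - (rows - 1)) / (rows - 1)       # map rows to [-1, 1]
    rho = np.sqrt(x ** 2 + y ** 2)              # Eq. (15)
    theta = np.arctan2(y, x)
    mask = rho <= 1.0                           # keep pixels inside the unit disc
    v_conj = radial_poly(p, q, rho) * np.exp(-1j * q * theta)   # V*_pq
    return (p + 1) / np.pi * np.sum(image[mask] * v_conj[mask])
```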

Appendix 2

1.1 Genetic algorithm (GA)

In a genetic algorithm, a population is initially generated by randomly creating a group of individuals (feature subsets). These individuals are then evaluated using the fitness function, i.e. the total precision value given in Eq. 2. Next, two individuals are selected based on their fitness; individuals with higher fitness have a higher chance of being selected. The selected individuals reproduce through crossover to create offspring, which are then mutated randomly. This process continues until an optimal or near-optimal feature subset is found or a fixed number of generations has elapsed. Each chromosome representing a feature subset is encoded as a binary string whose length equals the size of the particular feature set; for example, the chromosome length is 9 and 10 for the Db4 and CDF 9/7 wavelet based features, respectively. Each gene (element) of the chromosome refers to a particular feature and is set to 1 or 0 according to whether that feature is present or absent. In our work, the population size is 20. The control parameters are tournament selection, single-point crossover with a crossover probability of 0.7, and a mutation probability of 0.001 [32]. The stopping condition is either a maximum number of generations (e.g. 100) or reaching the maximum fitness value, which equals the total number of queries.
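
The following sketch mirrors the configuration described above (population of 20, tournament selection, single-point crossover with probability 0.7, mutation with probability 0.001, binary chromosomes). The fitness function, i.e. the total precision of Eq. (2), is supplied by the caller, and the early stop on reaching the maximum fitness is omitted for brevity.

```python
import random

def ga_feature_selection(fitness, n_features, pop_size=20, generations=100,
                         p_cross=0.7, p_mut=0.001, tournament=2):
    """Binary-encoded GA; `fitness` maps a 0/1 chromosome to a score."""
    def random_chrom():
        return [random.randint(0, 1) for _ in range(n_features)]

    def select(pop, scores):
        # Tournament selection: best of `tournament` randomly drawn individuals.
        picks = random.sample(range(len(pop)), tournament)
        return pop[max(picks, key=lambda i: scores[i])]

    pop = [random_chrom() for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(c) for c in pop]
        nxt = []
        while len(nxt) < pop_size:
            a, b = select(pop, scores), select(pop, scores)
            if random.random() < p_cross:                  # single-point crossover
                cut = random.randrange(1, n_features)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                child = [1 - g if random.random() < p_mut else g for g in child]
                nxt.append(child)
        pop = nxt[:pop_size]
    scores = [fitness(c) for c in pop]
    return max(zip(scores, pop))[1]                        # best chromosome found
```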

1.2 t test

The multivariate statistical t test selects the most discriminant features by assessing the significance of the difference between the means of two sample sets A and B. The value ‘t’ of the test is obtained as follows [29]:

$$\begin{aligned} D_a= {} \sum {(A_i - \mu _a)^2} \end{aligned}$$
(17)
$$\begin{aligned} D_b= {} \sum {(B_i - \mu _b)^2} \end{aligned}$$
(18)
$$\begin{aligned} V= {} \frac{D_a+D_b}{(n_a-1)+(n_b-1)} \end{aligned}$$
(19)
$$\begin{aligned} \sigma= {} \sqrt{\left( \frac{V}{n_a}+\frac{V}{n_b}\right) } \end{aligned}$$
(20)
$$\begin{aligned} t&= {} \frac{\mu _a-\mu _b}{\sigma } \end{aligned}$$
(21)

where \(A_i\) and \(B_i\) are the ith elements of the sets A and B, \(\mu _a\) and \(\mu _b\) are their means, \(D_a\) and \(D_b\) are the sums of squared deviates of the sets A and B, V is the estimated variance of the source population and \(\sigma\) is the standard deviation of the sampling distribution of sample mean differences. The degrees of freedom (d.f.) are \((n_a - 1) + (n_b - 1)\). In our case study, with 20 normal images (set A) and 19 abnormal images (set B) [34], the d.f. is 37. The critical t value for 37 degrees of freedom is 1.305. In the experiment, a ‘t’ value greater than 1.305 for a given feature implies a significant mean difference between the normal and abnormal mammograms.
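
Eqs. (17)–(21) translate directly into a per-feature selection routine, sketched below with the critical value of 1.305 used in this case study; the array layout (images in rows, features in columns) is an assumption.

```python
import numpy as np

def pooled_t_value(a, b):
    """t value of Eqs. (17)-(21) for one feature over two sample sets."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d_a = np.sum((a - a.mean()) ** 2)                 # Eq. (17)
    d_b = np.sum((b - b.mean()) ** 2)                 # Eq. (18)
    v = (d_a + d_b) / ((len(a) - 1) + (len(b) - 1))   # Eq. (19)
    sigma = np.sqrt(v / len(a) + v / len(b))          # Eq. (20)
    return (a.mean() - b.mean()) / sigma              # Eq. (21)

def select_by_t(features_a, features_b, critical=1.305):
    """Keep the features whose t value exceeds the critical value for 37 d.f."""
    keep = []
    for j in range(features_a.shape[1]):
        if pooled_t_value(features_a[:, j], features_b[:, j]) > critical:
            keep.append(j)
    return keep
```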

1.3 StARMiner and FD-ASE

The goal of the StARMiner algorithm is to find statistical association rules in order to select a minimal set of features that preserves the ability to discriminate images according to their type. Let \(x_j\) be an image category and \(f_i\) an image feature (attribute). The rules returned by the StARMiner algorithm have the format \(x_j \rightarrow f_i\). The threshold values used are as follows: \(\Delta \mu _{min}\) is the minimum allowed difference between the average of feature \(f_i\) in images from category \(x_j\) and its average in the remaining data set; \(\sigma _{max}\) is the maximum standard deviation of \(f_i\) values allowed within a category; \(\gamma _{min}\) is the minimum confidence required to reject the hypothesis \(H_0\). StARMiner mines a rule of the form \(x_j \rightarrow f_i\) if the conditions given in Eqs. (24)–(26) are satisfied, where the category mean and standard deviation are defined in Eqs. (22) and (23) [39].

$$\begin{aligned} \mu _{f_i}(V)= {} \frac{\sum _{k\in V} f_{ik}}{|V|} \end{aligned}$$
(22)
$$\begin{aligned} \sigma _{f_i}(V)= {} \sqrt{\frac{\sum _{k\in V}\left( f_{i_k} - \mu _{f_i}(V)\right) ^2 }{|V|}} \end{aligned}$$
(23)
$$\begin{aligned} \mu _{f_i}(T_{x_j})-\mu _{f_i}(T-T_{x_j})\ge {} \Delta \mu _{min} \end{aligned}$$
(24)
$$\begin{aligned} \sigma _{f_i}(T_{x_j})\le {} \sigma _{max}\end{aligned}$$
(25)
$$\begin{aligned} H_0 : \mu _{f_i}(T_{x_j})= {} \mu _{f_i}(T-T_{x_j}) \end{aligned}$$
(26)
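
A minimal sketch of the StARMiner conditions above is given below. The hypothesis test on \(H_0\) is realised here with a two-sample t test, which is only one possible instantiation of the confidence condition, and the data layout is assumed.

```python
import numpy as np
from scipy.stats import ttest_ind

def starminer_select(features, labels, category, d_mu_min, sigma_max, gamma_min):
    """Return the feature indices satisfying Eqs. (24)-(26) for one category.

    features: (n_images, n_features) array; labels: category label per image.
    """
    labels = np.asarray(labels)
    in_cat = labels == category
    selected = []
    for i in range(features.shape[1]):
        fi_in, fi_out = features[in_cat, i], features[~in_cat, i]
        mu_in, mu_out = fi_in.mean(), fi_out.mean()    # Eq. (22)
        sigma_in = fi_in.std()                         # Eq. (23)
        _, p_value = ttest_ind(fi_in, fi_out)          # test of H0, Eq. (26)
        if (mu_in - mu_out >= d_mu_min                 # Eq. (24)
                and sigma_in <= sigma_max              # Eq. (25)
                and (1.0 - p_value) >= gamma_min):     # reject H0 with confidence
            selected.append(i)
    return selected
```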

The FD-ASE algorithm performs dimensionality reduction on the feature vector. Considering the contribution of each feature to the fractal dimension of the data set, the algorithm finds dependence relationships between attributes and determines a set of independent ones, discarding the others. The approach of forward attribute inclusion is used to recreate the data set: the partial correlation fractal dimension of data-set projections is calculated, integrating more and more attributes until the correlation fractal dimension of the full data set is reached. More details of this algorithm are found in [5].

1.4 Sequential selection methods

The SFS [17] algorithm starts from an empty set. It sequentially adds the feature that yields the highest fitness value when combined with the features that have already been selected, and the search continues until the full feature set is obtained. SFS is unable to remove features that become obsolete after the addition of other features. SBS [17] starts from the full set and sequentially removes the feature whose removal gives the best possible fitness value. SBS works best when the optimal feature subset has a large number of features; however, re-evaluating the usefulness of a feature after it has been discarded is not possible in SBS.
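
Both procedures reduce to greedy wrappers around a subset-fitness function, as the following sketch (with an assumed fitness callable over lists of feature indices) illustrates.

```python
def sfs(fitness, n_features):
    """Sequential forward selection: greedily add the feature that most
    improves fitness; return the best subset seen along the way."""
    selected, remaining = [], set(range(n_features))
    best_subset, best_score = [], float('-inf')
    while remaining:
        score, f = max((fitness(selected + [f]), f) for f in remaining)
        selected.append(f)
        remaining.remove(f)
        if score > best_score:
            best_score, best_subset = score, list(selected)
    return best_subset

def sbs(fitness, n_features):
    """Sequential backward selection: greedily drop the feature whose removal
    keeps the fitness highest; return the best subset seen along the way."""
    selected = list(range(n_features))
    best_subset, best_score = list(selected), fitness(selected)
    while len(selected) > 1:
        score, f = max((fitness([g for g in selected if g != f]), f)
                       for f in selected)
        selected.remove(f)
        if score > best_score:
            best_score, best_subset = score, list(selected)
    return best_subset
```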

Cite this article

Chandy, D.A., Christinal, A.H., Theodore, A.J. et al. Neighbourhood search feature selection method for content-based mammogram retrieval. Med Biol Eng Comput 55, 493–505 (2017). https://doi.org/10.1007/s11517-016-1513-x
