Classification of breast masses in mammograms using genetic programming and feature selection

Nandi, R. J.; Nandi, A. K.; Rangayyan, R. M.; Scutt, D.

doi:10.1007/s11517-006-0077-6

Original Article
Published: 21 July 2006

Medical and Biological Engineering and Computing, volume 44, pages 683–694 (2006)

R. J. Nandi¹, A. K. Nandi¹, R. M. Rangayyan² & D. Scutt³

Abstract

Mammography is a widely used screening tool and is the gold standard for the early detection of breast cancer. The classification of breast masses into the benign and malignant categories is an important problem in the area of computer-aided diagnosis of breast cancer. A small dataset of 57 breast mass images, each with 22 features computed, was used in this investigation; the same dataset has been previously used in other studies. The extracted features relate to edge-sharpness, shape, and texture. The novelty of this paper is the adaptation and application of the classification technique called genetic programming (GP), which performs feature selection implicitly. To refine the pool of features available to the GP classifier, we used feature-selection methods, including the introduction of three statistical measures: Student’s t-test, the Kolmogorov–Smirnov test, and the Kullback–Leibler divergence. Both the training and test accuracies obtained were high: above 99.5% for training and typically above 98% for test experiments. A leave-one-out experiment showed 97.3% success in the classification of benign masses and 95.0% success in the classification of malignant tumors. A shape feature known as fractional concavity was found to be the most important among those tested, since it was automatically selected by the GP classifier in almost every experiment.
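
The three statistical measures named in the abstract can each be used to score how well a single feature separates the benign and malignant classes, and the scores can then rank the features. The following is a minimal sketch of that idea, not the authors' implementation. The function names, the Welch form of the t statistic, the Gaussian approximation used for the Kullback–Leibler divergence, and the synthetic three-feature data (with an arbitrary class split) are all assumptions made for illustration.

```python
import bisect
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t statistic for two samples (unequal variances allowed)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)

    def ecdf(sample, x):
        # Fraction of the (sorted) sample that is <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


def symmetric_kl_gaussian(a, b):
    """Symmetric KL divergence between Gaussians fitted to each sample.

    Assumption: each class-conditional feature distribution is modelled
    as a univariate Gaussian; a histogram estimate would also work.
    """
    ma, mb = mean(a), mean(b)
    va, vb = variance(a), variance(b)
    kl_ab = 0.5 * (math.log(vb / va) + (va + (ma - mb) ** 2) / vb - 1)
    kl_ba = 0.5 * (math.log(va / vb) + (vb + (ma - mb) ** 2) / va - 1)
    return kl_ab + kl_ba


def rank_features(class_a, class_b, score):
    """Return feature indices ordered from most to least discriminative."""
    n_features = len(class_a[0])
    scores = []
    for j in range(n_features):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        scores.append(score(col_a, col_b))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def make_synthetic(seed=0, n_benign=30, n_malignant=27):
    """Hypothetical data: feature 0 separates the classes, features 1-2 are noise."""
    rng = random.Random(seed)
    benign = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
              for _ in range(n_benign)]
    malignant = [[rng.gauss(3, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
                 for _ in range(n_malignant)]
    return benign, malignant


if __name__ == "__main__":
    benign, malignant = make_synthetic()
    for score in (welch_t, ks_statistic, symmetric_kl_gaussian):
        print(score.__name__, rank_features(benign, malignant, score))
```

In a pipeline like the one the abstract describes, only the top-ranked features from such a ranking would be passed on to the classifier. How many features to keep, and how the rankings from the three measures are combined, is not stated on this page.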

Acknowledgments

This research was partly funded by the Medical Research Council, UK, through the InterDisciplinary Bridging Awards (IDBA) scheme, and by a grant from the University of Calgary Research Grants Committee. The authors would like to thank Mr. L. Zhang, a research student at the University of Liverpool, for his initial assistance with the genetic programming code.

Author information

Authors and Affiliations

Department of Electrical Engineering and Electronics, The University of Liverpool, Brownlow Hill, Liverpool, L69 3GJ, UK
R. J. Nandi & A. K. Nandi
Department of Electrical and Computer Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada, T2N 1N4
R. M. Rangayyan
School of Health Sciences, The University of Liverpool, Thompson Yates Building, Liverpool, L69 3GB, UK
D. Scutt

Corresponding author

Correspondence to A. K. Nandi.

About this article

Cite this article

Nandi, R.J., Nandi, A.K., Rangayyan, R.M. et al. Classification of breast masses in mammograms using genetic programming and feature selection. Med Bio Eng Comput 44, 683–694 (2006). https://doi.org/10.1007/s11517-006-0077-6

Received: 16 December 2005
Accepted: 12 May 2006
Published: 21 July 2006
Issue Date: August 2006
DOI: https://doi.org/10.1007/s11517-006-0077-6
