Diagnosing Breast Masses in Digital Mammography Using Feature Selection and Ensemble Methods

Luo, Shu-Ting; Cheng, Bor-Wen

doi:10.1007/s10916-010-9518-8

Original Paper
Published: 14 May 2010

Volume 36, pages 569–577, (2012)

Journal of Medical Systems

Abstract

Accurate methods for predicting breast cancer are greatly needed. In this study, we used two feature selection methods, forward selection (FS) and backward selection (BS), to remove irrelevant features and thereby improve breast cancer prediction. The data were identified on full-field digital mammograms collected at the Institute of Radiology of the University of Erlangen-Nuremberg between 2003 and 2006. The results show that feature reduction improves predictive accuracy and that density is an irrelevant feature in this dataset. In addition, decision tree (DT), support vector machine with sequential minimal optimization (SVM-SMO), and their ensembles were applied to the breast cancer diagnostic problem in an attempt to achieve better predictive performance. The results demonstrate that ensemble classifiers are more accurate than a single classifier.
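The two wrapper-style feature selection strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier stands in for DT and SVM-SMO, leave-one-out accuracy is an assumed scoring rule, and the three-column dataset (two informative features, one noise feature) is synthetic. All function names here are hypothetical.

```python
import random

def centroid_predict(train_X, train_y, x, feats):
    """Nearest-centroid prediction using only the feature indices in `feats`."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of the stand-in classifier on `feats`."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += centroid_predict(train_X, train_y, X[i], feats) == y[i]
    return hits / len(X)

def forward_selection(X, y):
    """Start with no features; add the best one while accuracy improves."""
    selected, best = [], -1.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(loo_accuracy(X, y, selected + [j]), j) for j in remaining]
        score, j = max(scored, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected

def backward_selection(X, y):
    """Start with all features; drop one while accuracy does not decrease."""
    selected = list(range(len(X[0])))
    best = loo_accuracy(X, y, selected)
    while len(selected) > 1:
        scored = [(loo_accuracy(X, y, [k for k in selected if k != j]), j)
                  for j in selected]
        score, j = max(scored, key=lambda t: t[0])
        if score < best:
            break
        selected.remove(j)
        best = score
    return selected

def make_data(n=40, seed=0):
    """Synthetic data: columns 0 and 1 depend on the label, column 2 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0),
                  rng.gauss(1.5 * label, 1.0),
                  rng.gauss(0.0, 3.0)])
        y.append(label)
    return X, y

X, y = make_data()
fs = forward_selection(X, y)
bs = backward_selection(X, y)
print("forward selection kept:", fs, "accuracy:", loo_accuracy(X, y, fs))
print("backward selection kept:", bs, "accuracy:", loo_accuracy(X, y, bs))
```

Both procedures are greedy, so neither is guaranteed to find the best subset. By construction, forward selection never scores below the best single feature, and backward selection never scores below the full feature set.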

Acknowledgement

The authors would like to express their appreciation to Prof. Gordon Turner-Walker for his help in correcting earlier versions of this paper. We would also like to thank the anonymous reviewers for their valuable comments and insightful suggestions.

Author information

Authors and Affiliations

Graduate School of Industry Engineering and Management, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin, 64002, Taiwan
Shu-Ting Luo & Bor-Wen Cheng

Corresponding author

Correspondence to Shu-Ting Luo.

About this article

Cite this article

Luo, ST., Cheng, BW. Diagnosing Breast Masses in Digital Mammography Using Feature Selection and Ensemble Methods. J Med Syst 36, 569–577 (2012). https://doi.org/10.1007/s10916-010-9518-8

Received: 18 January 2010
Accepted: 23 April 2010
Published: 14 May 2010
Issue Date: April 2012
DOI: https://doi.org/10.1007/s10916-010-9518-8
