A GA-based feature selection and parameter optimization of an ANN in diagnosing breast cancer

  • Theoretical Advances
  • Pattern Analysis and Applications

Abstract

Breast cancer is the most commonly diagnosed cancer, and the most common cause of cancer death, among women worldwide. There is evidence that early detection and treatment can increase the survival rate of breast cancer patients. The traditional method of diagnosing the disease relies on human experience to identify certain patterns in the data, which makes it prone to human error, time-consuming and labour-intensive. This work therefore proposes an automatic breast cancer diagnosis technique that uses a genetic algorithm (GA) for simultaneous feature selection and parameter optimization of an artificial neural network (ANN). The proposed algorithm is implemented with three variations of the backpropagation technique for fine-tuning the ANN weights, namely resilient backpropagation (GAANN_RP), Levenberg–Marquardt (GAANN_LM) and gradient descent with momentum (GAANN_GD), and their performances are compared. The effects of feature selection and of manually determining the hidden node size are also investigated. Of the three variants, GAANN_RP performs best, achieving a best and an average correct classification rate of 99.24 % and 98.29 %, respectively, on the Wisconsin breast cancer dataset, which is comparable with results reported in other works in the literature.
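To make the idea of simultaneous feature selection and parameter optimization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary chromosome whose first 9 bits switch the 9 Wisconsin attributes on or off and whose last 5 bits encode the hidden node count; the bit widths, the GA operators (binary tournament, one-point crossover, bit-flip mutation, elitism) and all function names are illustrative assumptions. The fitness function `toy_fitness` is a placeholder: in a real run it would presumably train the ANN with one of the backpropagation variants and return its classification accuracy.

```python
import random

N_FEATURES = 9    # attribute count of the original Wisconsin breast cancer dataset
HIDDEN_BITS = 5   # assumption: hidden node count encoded in 5 bits, giving 1..32


def decode(chrom):
    """Split a bit list into (feature mask, hidden node count)."""
    mask = chrom[:N_FEATURES]
    hidden = int("".join(str(b) for b in chrom[N_FEATURES:]), 2) + 1
    return mask, hidden


def toy_fitness(chrom):
    """Placeholder score (hypothetical). A real run would train an ANN on the
    selected features with `hidden` nodes and return its accuracy."""
    mask, hidden = decode(chrom)
    if not any(mask):                       # a network with no inputs is useless
        return 0.0
    target = [1, 1, 0, 1, 0, 1, 1, 0, 1]    # hypothetical "useful" feature subset
    agreement = sum(m == t for m, t in zip(mask, target)) / N_FEATURES
    return agreement - 0.01 * abs(hidden - 8)   # hypothetical preference for 8 nodes


def run_ga(fitness, pop_size=20, generations=30, p_cross=0.8, p_mut=0.02, seed=0):
    """Generational GA with binary tournament selection, one-point crossover,
    bit-flip mutation and elitism (all operator choices are assumptions)."""
    rng = random.Random(seed)
    length = N_FEATURES + HIDDEN_BITS
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def pick(population):
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = [best[:]]                     # elitism: carry the best individual over
        while len(nxt) < pop_size:
            p1, p2 = pick(pop), pick(pop)
            if rng.random() < p_cross:
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Flip each bit independently with probability p_mut.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best


if __name__ == "__main__":
    mask, hidden = decode(run_ga(toy_fitness))
    print("selected features:", mask, "hidden nodes:", hidden)
```

Because both the feature mask and the hidden node count live in one chromosome, a single fitness evaluation scores a feature subset together with a network size, which is what distinguishes this from selecting features first and tuning the network afterwards.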

Acknowledgments

This research is partially supported by Universiti Sains Malaysia’s Research University Postgraduate Research Grant Scheme (USM-RU-PGRS) entitled ‘Genetic Algorithm-Artificial Neural Network Hybrid Intelligence’ and the Universiti Sains Malaysia’s Research University Grant entitled ‘Study on Compatibility of FTIR Spectral Characteristics for the Development of Intelligent Cervical Pre-cancerous Diagnostic System’. The author also wishes to thank Universiti Teknologi MARA for its financial assistance.

Author information

Corresponding author

Correspondence to Fadzil Ahmad.

About this article

Cite this article

Ahmad, F., Mat Isa, N.A., Hussain, Z. et al. A GA-based feature selection and parameter optimization of an ANN in diagnosing breast cancer. Pattern Anal Applic 18, 861–870 (2015). https://doi.org/10.1007/s10044-014-0375-9

  • DOI: https://doi.org/10.1007/s10044-014-0375-9
