Abstract
Classification and recognition systems have become considerably more effective at helping medical experts diagnose diseases. Breast cancer is becoming a leading cause of death among women worldwide; meanwhile, it is confirmed that early detection and accurate diagnosis of this disease can ensure long survival of patients. This paper presents a hybrid intelligent system for the recognition of breast cancer tumors. The proposed system has two main modules: a feature extraction module and a predictor module. In the feature extraction module, rough set theory is used to preprocess the attributes: redundant attributes and conflicting objects are deleted from the decision table, on the condition that no important information is lost. In the predictor module, a combined classifier based on the K-nearest neighbor classifier is proposed. Experiments were conducted on the widely used Wisconsin breast cancer dataset from the University of California, Irvine. The experimental results show that the proposed hybrid system improves the rate of correct diagnosis. The proposed combined classifier with rough set-based feature selection achieves 99.41% classification accuracy using only 4 features, which is the best result reported to date. Several performance metrics are used to show the effectiveness of the proposed hybrid system. Given these results, the proposed method is very promising compared with previously reported results and can be used with confidence for other breast cancer diagnosis problems.
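The two-module pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state which reduct algorithm is used or how the K-nearest neighbor classifiers are combined, so everything below is an assumption made for illustration: a greedy forward selection driven by the rough-set dependency degree stands in for the feature extraction module, and a majority vote over K-nearest neighbor classifiers with different values of k stands in for the combined classifier. The function names (`dependency`, `greedy_reduct`, `knn_predict`, `ensemble_predict`) and the six-row toy table are hypothetical; the table is not the Wisconsin dataset.

```python
import math
from collections import Counter, defaultdict


def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the decision on `attrs`: the fraction
    of objects whose equivalence class carries a single decision label."""
    seen_labels = defaultdict(set)
    sizes = Counter()
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        sizes[key] += 1
    consistent = sum(sizes[k] for k, ls in seen_labels.items() if len(ls) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, labels):
    """Assumed selection strategy: repeatedly add the attribute that raises
    the dependency degree most, until it matches that of the full set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        remaining = [a for a in all_attrs if a not in chosen]
        best = max(remaining, key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return sorted(chosen)


def knn_predict(train_x, train_y, x, k):
    """Plain K-nearest neighbor vote using Euclidean distance."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def ensemble_predict(train_x, train_y, x, ks=(1, 3, 5)):
    """Assumed combination rule: majority vote over KNN with several k."""
    votes = [knn_predict(train_x, train_y, x, k) for k in ks]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy decision table: attribute 0 separates the classes,
# attribute 1 is redundant with it, attribute 2 is uninformative.
rows = [(1, 1, 5), (2, 1, 3), (1, 2, 3), (8, 9, 5), (9, 8, 3), (8, 8, 5)]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

reduct = greedy_reduct(rows, labels)
train_x = [[row[a] for a in reduct] for row in rows]

print("selected attributes:", reduct)
print("prediction for 2:", ensemble_predict(train_x, labels, [2]))
print("prediction for 7:", ensemble_predict(train_x, labels, [7]))
```

On this toy table the selection step keeps a single attribute, which mirrors in miniature the abstract's point that a small attribute subset can preserve the information needed for classification. The accuracy figure reported in the abstract cannot be reproduced from this sketch.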







About this article
Cite this article
El-Baz, A.H. Hybrid intelligent system-based rough set and ensemble classifier for breast cancer diagnosis. Neural Comput & Applic 26, 437–446 (2015). https://doi.org/10.1007/s00521-014-1731-9
DOI: https://doi.org/10.1007/s00521-014-1731-9