An ensemble algorithm using quantum evolutionary optimization of weighted type-II fuzzy system and staged Pegasos Quantum Support Vector Classifier with multi-criteria decision making system for diagnosis and grading of breast cancer

  • Data analytics and machine learning
Soft Computing

Abstract

Breast cancer is a life-threatening disease, found predominantly in women, whose danger stems from its invasive and proliferative nature. Early detection contributes significantly to reduced mortality and is therefore a keen focus of ongoing research. However, developing a technique that assesses the severity of a patient's condition at an early stage is challenging. Manual diagnostic techniques are time-consuming and can result in inaccurate diagnoses. Prompted by these facts, an automated framework with a quantum-optimized generated rule base is developed to cluster the data by the degree of criticality of the cancer patients and then classify each case as benign or malignant using the probability of malignancy of the clusters, along with assigning a cancer grade. First, after data pre-processing, significant features are selected using an integrated feature selection approach. An efficient weighting algorithm is proposed that incorporates physicians' knowledge and the benefits of regression analysis, thereby providing a novel approach to breast cancer detection. A novel ensemble clustering and classification algorithm employing a voting-based Weighted Interval Type-II Fuzzy Inference System and a Staged Pegasos Quantum Support Vector Classifier is then developed on the basis of a prioritization of clusters reflecting the critical state of the breast cancer. A grading approach based on a fuzzy linguistic multi-criteria decision making system is also proposed. Finally, the research is validated on the Wisconsin Breast Cancer dataset. The proposed integrated model is implemented in detail to establish its superiority over existing models in the literature.
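For readers unfamiliar with the Pegasos solver named in the abstract, the following is a minimal sketch of the classical Pegasos update (stochastic sub-gradient descent on the regularized hinge loss). It is not the staged quantum variant proposed in the article, whose details are not given in this preview. The function names, the toy data, and the values of `lam` and `n_iters` are illustrative assumptions only.

```python
import random


def pegasos_train(X, y, lam=0.01, n_iters=2000, seed=0):
    """Classical Pegasos for a linear SVM (illustrative sketch, not the paper's method).

    X: list of feature vectors (append a constant 1.0 to each to model a bias term).
    y: list of labels in {-1, +1}.
    lam: regularization strength (assumed value).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))          # pick one training example at random
        eta = 1.0 / (lam * t)              # step size 1 / (lambda * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        shrink = 1.0 - 1.0 / t             # equals 1 - eta * lam
        w = [shrink * wj for wj in w]      # regularization shrinkage
        if margin < 1.0:                   # hinge loss active: add the sub-gradient step
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def pegasos_predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1


# Toy, linearly separable data (hypothetical); the last feature is a constant bias input.
X_toy = [
    [2.0, 2.0, 1.0], [3.0, 2.5, 1.0], [2.5, 3.0, 1.0],
    [-2.0, -2.0, 1.0], [-3.0, -2.5, 1.0], [-2.5, -3.0, 1.0],
]
y_toy = [1, 1, 1, -1, -1, -1]

w = pegasos_train(X_toy, y_toy)
predictions = [pegasos_predict(w, x) for x in X_toy]
```

In this sketch the labels +1 and -1 could stand for malignant and benign, but any such mapping, like the features themselves, is an assumption rather than something stated in the abstract.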

Data availability

Enquiries about data availability should be directed to the authors.

Acknowledgements

The authors are grateful to Indian Institute of Technology (Indian School of Mines), Dhanbad, for contributing necessary facilities to conclude this research.

Funding

The authors have not received any funding for conducting this study.

Author information

Corresponding author

Correspondence to Ananya Das.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Statement of informed consent is not applicable in this manuscript.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chatterjee, S., Das, A. An ensemble algorithm using quantum evolutionary optimization of weighted type-II fuzzy system and staged Pegasos Quantum Support Vector Classifier with multi-criteria decision making system for diagnosis and grading of breast cancer. Soft Comput 27, 7147–7178 (2023). https://doi.org/10.1007/s00500-023-07939-x

  • DOI: https://doi.org/10.1007/s00500-023-07939-x
