Multiobjective hybrid monarch butterfly optimization for imbalanced disease classification problem

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Real-world datasets are rarely balanced, and disease datasets in particular are usually highly skewed, with a few minority classes alongside one or more dominant majority classes. In this research, we put forward a novel hybrid architecture for imbalanced binary disease datasets. It uses an evolutionary algorithm (EA), namely monarch butterfly optimization (MBO), to find an effective combination of the sensitive parameter values of a support vector machine (SVM) classifier and thereby improve the SVM's performance. MBO is used to optimize three objectives: prediction accuracy (PAC), sensitivity (SEN), and specificity (SPE). Additionally, we propose a totally unimodular matrix (TUM) for deciding between local and global search, and a limit-points-based selection of non-dominated solutions for generating an efficient initial population, since these two choices greatly affect the performance of EAs. The proposed hybrid architecture is tested on 18 disease datasets with binary class labels, and the results demonstrate improvements using the proposed method. For the majority of the datasets, 100% sensitivity, 100% specificity, or both were attained. Moreover, pertinent statistical tests were carried out to ascertain the performances obtained.
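
To make the three objectives concrete, the following is a minimal Python sketch, not the authors' implementation. It computes PAC, SEN, and SPE from binary predictions using the standard confusion-matrix definitions, and filters a set of candidate objective vectors down to the non-dominated ones. The function names, the choice of label 1 as the positive (disease) class, and the plain Pareto filter are assumptions made for illustration; the TUM and limit-points procedures of the paper are not described in the abstract and are not reproduced here.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float, float]  # (PAC, SEN, SPE), all to be maximized


def objectives(y_true: Sequence[int], y_pred: Sequence[int]) -> Objectives:
    """Return (PAC, SEN, SPE) for binary labels.

    Assumption: label 1 is the positive (disease) class, label 0 the negative.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    pac = (tp + tn) / len(y_true)             # prediction accuracy
    sen = tp / (tp + fn) if tp + fn else 0.0  # sensitivity: true positive rate
    spe = tn / (tn + fp) if tn + fp else 0.0  # specificity: true negative rate
    return pac, sen, spe


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical imbalanced example: 2 positives, 6 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
pac, sen, spe = objectives(y_true, y_pred)

# Hypothetical objective vectors for three candidate SVM parameter settings.
front = non_dominated([(0.9, 0.5, 0.95), (0.8, 0.8, 0.8), (0.7, 0.7, 0.7)])
```

In the example, accuracy is 0.75 while sensitivity is only 0.5, which illustrates why accuracy alone can mislead on skewed data and why sensitivity and specificity are treated as separate objectives.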

Acknowledgements

K. Kannan gratefully acknowledges the Tata Realty-IT city-SASTRA Srinivasa Ramanujan Research Cell of SASTRA University (India) for the financial support extended in carrying out this research work. Xiao-Zhi Gao’s research work was partially supported by the National Natural Science Foundation of China (NSFC) under Grant 51875113.

Author information

Corresponding author

Correspondence to Diptendu Sinha Roy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nalluri, M.R., Kannan, K., Gao, XZ. et al. Multiobjective hybrid monarch butterfly optimization for imbalanced disease classification problem. Int. J. Mach. Learn. & Cyber. 11, 1423–1451 (2020). https://doi.org/10.1007/s13042-019-01047-9

  • DOI: https://doi.org/10.1007/s13042-019-01047-9
