Abstract
Classification is a focal topic of study within the machine learning research community. Interpretable machine learning algorithms have been gaining ground against black-box models because people want to understand the decision-making process. Mathematical programming based classifiers have received attention because they can compete with state-of-the-art algorithms in terms of accuracy and interpretability. This work introduces a single-level hyper-box classification approach, which is formulated mathematically as a Mixed Integer Linear Programming model. Its objective is to identify the patterns of the dataset using a hyper-box representation. Hyper-boxes enclose as many samples of the corresponding class as possible. At the same time, they are not allowed to overlap with hyper-boxes of a different class. The interpretability of the approach stems from the fact that IF-THEN rules can easily be generated. To evaluate the performance of the proposed method, its prediction accuracy is compared with that of other state-of-the-art interpretable approaches on a number of real-world datasets. The results provide evidence that the algorithm compares favourably with well-known counterparts.
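To make the hyper-box representation concrete, the following is a minimal Python sketch of how a box can enclose samples, be checked for overlap with another box, and be turned into an IF-THEN rule. It is an illustration only and not the Mixed Integer Linear Programming model of the paper: the box bounds are given by hand rather than optimised, and the names `HyperBox`, `contains`, `overlaps`, `to_rule`, and `predict` are hypothetical. Returning `None` for a sample outside every box is a simplifying assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HyperBox:
    """Axis-aligned box: one [lower, upper] interval per feature (hypothetical helper)."""

    label: str
    lower: tuple[float, ...]
    upper: tuple[float, ...]

    def contains(self, sample: Sequence[float]) -> bool:
        # A sample is enclosed when every feature lies inside the box's interval.
        return all(lo <= x <= up for lo, x, up in zip(self.lower, sample, self.upper))

    def overlaps(self, other: HyperBox) -> bool:
        # Two boxes overlap only if their intervals intersect on every feature;
        # separation along a single feature is enough to keep them disjoint.
        return all(
            lo_a <= up_b and lo_b <= up_a
            for lo_a, up_a, lo_b, up_b in zip(
                self.lower, self.upper, other.lower, other.upper
            )
        )

    def to_rule(self, feature_names: Sequence[str]) -> str:
        # Each feature interval becomes one condition of the IF-THEN rule.
        conditions = " AND ".join(
            f"{lo} <= {name} <= {up}"
            for name, lo, up in zip(feature_names, self.lower, self.upper)
        )
        return f"IF {conditions} THEN class = {self.label}"


def predict(boxes: Sequence[HyperBox], sample: Sequence[float]) -> Optional[str]:
    # Assumption: a sample outside every box is left unclassified (None).
    for box in boxes:
        if box.contains(sample):
            return box.label
    return None


# Hand-picked boxes for two classes, separated along the first feature.
box_a = HyperBox("A", (0.0, 0.0), (1.0, 1.0))
box_b = HyperBox("B", (2.0, 0.0), (3.0, 1.0))
```

Here `box_a.overlaps(box_b)` is false because the boxes are disjoint along the first feature, and `box_a.to_rule(["x1", "x2"])` yields a readable rule of the kind the abstract refers to.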
Acknowledgements
The authors gratefully acknowledge the financial support from the Engineering and Physical Sciences Research Council (EPSRC) under the project EP/V051008/1.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Liapis, G.I., Papageorgiou, L.G. (2023). Hyper-box Classification Model Using Mathematical Programming. In: Sellmann, M., Tierney, K. (eds) Learning and Intelligent Optimization. LION 2023. Lecture Notes in Computer Science, vol 14286. Springer, Cham. https://doi.org/10.1007/978-3-031-44505-7_2
DOI: https://doi.org/10.1007/978-3-031-44505-7_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44504-0
Online ISBN: 978-3-031-44505-7
eBook Packages: Computer Science, Computer Science (R0)