Abstract
We present a way to measure the similarity between sets of rules for regression tasks. This was identified as an important but missing tool for investigating Metaheuristic Rule Set Learners (MRSLs), a class of algorithms that use metaheuristics such as Genetic Algorithms to solve learning tasks. Commonly used metrics based on predictive performance, such as mean absolute error, do not capture most users’ actual preferences when they choose these kinds of models, since users typically aim for model interpretability (i.e. a low number of rules, meaningful rule placement, etc.) and not for low error alone. Our similarity measure is based on a form of metaheuristic-agnostic edit distance. It is meant to be used, in conjunction with a certain class of benchmark problems, for analysing and improving an as yet under-researched part of MRSL algorithms: the metaheuristic that optimizes the model’s structure (i.e. the set of rule conditions). We discuss the measure’s most important properties and demonstrate its applicability through experiments on the best-known MRSL, XCSF, comparing it with two non-metaheuristic Rule Set Learners, Decision Trees and Random Forests.
Notes
1. A rule k’s training match set is the set of training data points for which \(m(\psi _{k}; \cdot )\) is fulfilled, i.e. \(\{x \in X \mid m(\psi _{k}; x) = 1\}\).
2. We slightly abuse notation here and overload the matching function m so that the training data input, an \(N\times \mathcal {D}_\mathcal {X}\) matrix X consisting of N vectors \(x_{n} \in \mathcal {X}\), can be passed to a single condition \(m(\psi ; \cdot )\) to obtain an N-vector, i.e. \(m(\psi ; X) = \left( m(\psi ; x_{n})\right) _{n=1}^{N} \in \{0, 1\}^{N}\).
3. For \(N=768\) training data points, our own (not at all optimized) code took around 0.0005 s per computation of \(\delta _{X}\) (mean over all computations of \(d_{X}\) with \(N=768\) performed for Fig. 2) and correspondingly around 0.2 s for computing \(d_{X}\) for two model structures of size 20. For \(N=2000\), we measured 0.002 s per \(\delta _{X}\) computation (and correspondingly 0.8 s for size 20 model structures).
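The matching notation in notes 1 and 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that a condition \(\psi\) is a list of closed intervals, one per input dimension; this excerpt does not fix the condition representation, and the names `matches`, `match_vector` and `match_set` are hypothetical rather than taken from the authors' code.

```python
from typing import List, Sequence, Tuple

# Assumption for illustration only: a condition psi is one closed
# (lower, upper) interval per input dimension.
Condition = Sequence[Tuple[float, float]]


def matches(psi: Condition, x: Sequence[float]) -> int:
    """m(psi; x): 1 if x lies inside every interval of psi, else 0."""
    return int(all(lo <= xi <= hi for (lo, hi), xi in zip(psi, x)))


def match_vector(psi: Condition, X: Sequence[Sequence[float]]) -> List[int]:
    """Overloaded m(psi; X): apply m(psi; .) to each of the N rows of X."""
    return [matches(psi, x) for x in X]


def match_set(psi: Condition, X: Sequence[Sequence[float]]) -> List[Sequence[float]]:
    """Training match set {x in X | m(psi; x) = 1}."""
    return [x for x in X if matches(psi, x) == 1]


# Toy data: N = 3 training inputs with 2 input dimensions each.
X = [[0.1, 0.2], [0.5, 0.5], [0.9, 0.4]]
# Hypothetical condition covering [0.0, 0.6] x [0.1, 0.6].
psi = [(0.0, 0.6), (0.1, 0.6)]

print(match_vector(psi, X))  # [1, 1, 0]
print(match_set(psi, X))     # [[0.1, 0.2], [0.5, 0.5]]
```

The match vector is an element of \(\{0, 1\}^{N}\) as in note 2, and the match set of note 1 consists of exactly those rows of X whose entry in that vector is 1.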
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pätzel, D., Nordsieck, R., Hähner, J. (2024). Measuring Similarities in Model Structure of Metaheuristic Rule Set Learners. In: Smith, S., Correia, J., Cintrano, C. (eds) Applications of Evolutionary Computation. EvoApplications 2024. Lecture Notes in Computer Science, vol 14635. Springer, Cham. https://doi.org/10.1007/978-3-031-56855-8_16
DOI: https://doi.org/10.1007/978-3-031-56855-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56854-1
Online ISBN: 978-3-031-56855-8
eBook Packages: Computer Science (R0)