Abstract
Prototype generation is the problem of generating a small set of instances from a large data set, to be used by a k-nearest neighbor (KNN) classifier. Two key aspects must be considered when developing a prototype generation method: (1) the generalization performance of a KNN classifier that uses the prototypes; and (2) the amount of data set reduction, as given by the number of prototypes. These factors are in conflict because, in general, maximizing data set reduction decreases accuracy, and vice versa. The problem can therefore be approached naturally with multi-objective optimization techniques. This paper introduces a novel multi-objective evolutionary algorithm for prototype generation whose objectives are precisely the amount of reduction and an estimate of the generalization performance achieved by the selected prototypes. Through a comprehensive experimental study, we show that the proposed approach outperforms most of the prototype generation methods proposed so far. Specifically, it obtains prototypes that offer a better tradeoff between accuracy and reduction than alternative methodologies.
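To make the two conflicting objectives concrete, the following is a minimal sketch and not the MOPG algorithm itself. It assumes k = 1, Euclidean distance, and accuracy measured on the training data as a stand-in for the generalization estimate. The function names (`objectives`, `dominates`, `pareto_front`) and the toy data are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def one_nn_predict(prototypes, proto_labels, x):
    """Return the label of the prototype closest to x."""
    nearest = min(range(len(prototypes)), key=lambda i: dist(prototypes[i], x))
    return proto_labels[nearest]


def objectives(prototypes, proto_labels, X, y):
    """Return (accuracy, reduction) for one candidate prototype set.

    accuracy:  fraction of (X, y) classified correctly by 1-NN on the prototypes
               (assumption: training accuracy as a simple performance estimate)
    reduction: 1 - |prototypes| / |X|
    """
    hits = sum(one_nn_predict(prototypes, proto_labels, x) == t for x, t in zip(X, y))
    return hits / len(X), 1.0 - len(prototypes) / len(X)


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (both objectives maximized)."""
    return all(ai >= bi for ai, bi in zip(a, b)) and any(ai > bi for ai, bi in zip(a, b))


def pareto_front(candidates):
    """Keep the objective vectors that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Toy two-class data set (hypothetical).
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]

# Three candidate prototype sets: one per class, a single prototype, all instances.
two_protos = objectives([(0.33, 0.33), (5.33, 5.33)], [0, 1], X, y)
one_proto = objectives([(3, 3)], [0], X, y)
no_reduction = objectives(X, y, X, y)

front = pareto_front([two_protos, one_proto, no_reduction])
```

In this toy case the full data set is dominated by the two-prototype candidate (same accuracy, more reduction), while the one-prototype candidate stays on the front because it trades accuracy for a higher reduction. A multi-objective method returns such a set of non-dominated tradeoffs rather than a single solution.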




Notes
Note that the considered data sets include both numeric and nominal attributes. For simplicity, we deliberately transformed nominal attributes into integers and applied MOPG without any modification.
Note that, in evolutionary algorithms, larger populations do not necessarily yield better performance. This behavior is observed when the search space has not been explored extensively, which helps to avoid overfitting.
See also http://sci2s.ugr.es/pgtax/.
This is the statistical test recommended by Demsar for comparing classification methods over multiple data sets [11].
References
Aler R, Handl J, Knowles JD (2013) Comparing multi-objective and threshold-moving ROC curve generation for a prototype-based classifier. In: Proceedings of the 15th annual conference on genetic and evolutionary computation. ACM, pp 1029–1036
Cervantes A, Galvan IM, Isasi P (2009) AMPSO: a new particle swarm method for nearest neighborhood classification. IEEE Trans. Sys. Man Cybern. B 39(5):1082–1091
Chatelain C, Adam S, Lecourtier Y, Heutte L, Paquet T (2010) A multi-model selection framework for unknown and/or evolutive misclassification cost problems. Pattern Recogn. 43(3):815–823
Chen JH, Chen HM, Ho SY (2005) Design of nearest neighbor classifiers: multi-objective approach. Int. J. Approx. Reason. 40:3–22
Coello Coello CA, Lamont GB, Veldhuizen DAV (2007) Evolutionary algorithms for solving multi-objective problems. Genetic and evolutionary computation, 2nd edn. Springer, USA
Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE Trans. Inform. Theory 13(1):21–27
Cruz-Vega I, Garcia-Limon M, Escalante HJ (2014) Adaptive surrogates with a neuro-fuzzy network and granular computing. In: Proceedings of GECCO 2014. ACM Press, pp 761–768
Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6(2):182–197
Decaestecker C (1997) Finding prototypes for nearest neighbour classification by means of gradient descent and deterministic annealing. Pattern Recogn. 30(2):281–288
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dos-Santos EM, Sabourin R, Maupin P (2008) A dynamic overproduce-and-choose strategy for the selection of classifier ensembles. Pattern Recogn. 41:2993–3009
Eiben AE, Smith JE (2010) Introduction to evolutionary computing. Natural computing. Springer
Escalante HJ, Mendoza KM, Graff M, Morales-Reyes A (2013) Genetic programming of prototypes for pattern classification. In: Proceedings of IbPRIA 2013, vol. 7887 of LNCS. Springer, pp 100–107
Fernandez F, Isasi P (2004) Evolutionary design of nearest prototype classifiers. J. Heuristics 10:431–454
Garain U (2008) Prototype reduction using an artificial immune system. Pattern Anal. Appl. 11(3–4):353–363
García S, Derrac J, Cano JR, Herrera F (2012) Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE Trans. Pattern Anal. Mach. Intell. 34(3):417–435
Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, New York
Kim SW, Oommen BJ (2003) A brief taxonomy and ranking of creative prototype reduction schemes. Pattern Anal. Appl. 6:232–244
Koplowitz J, Brown T (1981) On the relation of performance to editing in nearest neighbor rules. Pattern Recogn. 13(3):251–255
Li J, Wang Y (2013) A nearest prototype selection algorithm using multi-objective optimization and partition. In: Proceedings of the 9th International Conference on Computational Intelligence and Security. IEEE, pp 264–268
Lozano M, Sotoca JM, Sánchez JS, Pla F, Pękalska E, Duin RPW (2006) Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces. Pattern Recogn. 39(10):1827–1838
Nanni L, Lumini A (2008) Particle swarm optimization for prototype reduction. Neurocomputing 72(4–6):1092–1097
Olvera A, Carrasco-Ochoa JA, Martinez-Trinidad JF, Kittler J (2010) A review of instance selection methods. Artif. Intell. Rev. 34:133–143
Storn R, Price KV (1997) Differential evolution: a simple and efficient heuristic for global optimization over continuous spaces. J. Global Optim. 11(10):341–359
Rosales A, Coello CA, Gonzalez J, Reyes CA, Escalante HJ (2013) A hybrid surrogate-based approach for evolutionary multi-objective optimization. In: Proceedings of Congress on Evolutionary Computation 2013. IEEE, pp 2548–2555
Rosales A, Gonzalez J, Coello CA, Escalante HJ, Reyes CA (2014) Surrogate-assisted multi-objective model selection for support vector machines. Neurocomputing (in press)
Triguero I, Derrac J, García S, Herrera F (2012) A taxonomy and experimental study on prototype generation for nearest neighbor classification. IEEE Trans. Sys. Man Cybern. C 42(1):86–100
Triguero I, Peralta D, Bacardit J, Garcia S, Herrera F (2014) MRPR: a mapreduce solution for prototype reduction in big data classification. Neurocomputing (in press)
Triguero I, Garcia S, Herrera F (2011) Differential evolution for optimizing the positioning of prototypes in nearest neighbor classification. Pattern Recogn. 44:901–916
Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2007) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37
Xia H, Zhuang J, Yu D (2013) Novel soft subspace clustering with multi-objective evolutionary approach for high-dimensional data. Pattern Recogn. 46:2562–2575
Acknowledgments
This work was partially supported by the LACCIR programme under project ID R1212LAC006. Hugo Jair Escalante was supported by the internships programme of CONACyT under grant No. 234415.
Cite this article
Escalante, H.J., Marin-Castro, M., Morales-Reyes, A. et al. MOPG: a multi-objective evolutionary algorithm for prototype generation. Pattern Anal Applic 20, 33–47 (2017). https://doi.org/10.1007/s10044-015-0454-6
DOI: https://doi.org/10.1007/s10044-015-0454-6