Abstract
This paper considers the development of mathematical software for forming training samples. Exhaustive and evolutionary sampling methods are developed. Criteria for the selection, censoring, and pseudoclustering of instances are introduced in these methods, which speeds up the sampling process and keeps the resulting samples within a prescribed size limit. The proposed methods automatically extract from the original sample a subset of minimal size that contains the instances most important for constructing the model. Complexity estimates are derived for the developed methods, and experiments are conducted to determine their practical applicability. The proposed estimates and the identified dependences make it possible to take the available computing resources into account during sampling.
Additional information
Original Russian Text © S.A. Subbotin, 2013, published in Avtomatika i Vychislitel’naya Tekhnika, 2013, No. 3, pp. 5–16.
About this article
Cite this article
Subbotin, S.A. Methods of sampling based on exhaustive and evolutionary search. Aut. Control Comp. Sci. 47, 113–121 (2013). https://doi.org/10.3103/S0146411613030073
DOI: https://doi.org/10.3103/S0146411613030073