Abstract
Cluster analysis is a traditional tool for multivariate data processing. Using the k-means method, we can split a pattern set into a given number of clusters, which can then serve as hidden classes for the final classification into known output classes. This paper focuses on approaches to forming an optimal union of hidden classes. The resulting tasks include binary programming and convex optimization problems. Another way of obtaining hidden classes is to design an imperfect classifier system. A novel context out learning approach is also discussed, in which simple classifiers form the basis of a system of hidden classes that are easy to unite into output classes using the optimal algorithm. All these approaches are useful in many applications, including econometric research. There are two main methodologies, supervised and unsupervised learning, which work with a pattern set whose output classification is known or unknown, respectively. With supervised learning, we can combine context out learning with the optimal union of hidden classes to obtain the final classifier. With unsupervised learning, we begin with cluster analysis or a similar approach to obtain the hidden class system for a subsequent optimal union. The optimal union algorithm is therefore applicable to a wide range of classification tasks. The presented techniques are demonstrated on an artificial pattern set and on real data related to crisis prediction based on the clustering of macroeconomic indicators.
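The idea of clustering first and then uniting hidden classes into output classes can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract refers to binary programming and convex optimization, whereas this example simply assigns each hidden class to the output class most frequent among its patterns, which minimizes the training error count when every hidden class may be assigned independently. The toy data, the function names `kmeans` and `union_hidden_classes`, and the choice of seeding centroids with the first k points are all assumptions made for illustration.

```python
from collections import Counter


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels


def union_hidden_classes(hidden, output):
    """Map each hidden class to the output class most frequent among its patterns."""
    counts = {}
    for h, y in zip(hidden, output):
        counts.setdefault(h, Counter())[y] += 1
    return {h: c.most_common(1)[0][0] for h, c in counts.items()}


# Hypothetical XOR-like pattern set: four clumps, two output classes.
points = [
    (0.0, 0.0), (5.0, 5.0), (0.0, 5.0), (5.0, 0.0),
    (0.2, 0.1), (5.1, 4.9), (0.1, 5.2), (4.8, 0.2),
    (-0.1, 0.3), (4.9, 5.2), (-0.2, 4.8), (5.2, -0.1),
]
classes = ["A", "A", "B", "B"] * 3

hidden = kmeans(points, k=4)                     # hidden classes from clustering
mapping = union_hidden_classes(hidden, classes)  # union into output classes
predicted = [mapping[h] for h in hidden]
errors = sum(p != y for p, y in zip(predicted, classes))
print(mapping, errors)
```

In this toy case neither output class is a single compact cluster, but each is a union of two hidden classes, which is the situation the union step is meant to handle.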
Acknowledgements
The authors would like to acknowledge the support of the research grants SGS 17/196/OHK4/3T/14 and SGS 17/197/OHK4/3T/14.
Cite this article
Hrebik, R., Kukal, J. & Jablonsky, J. Optimal unions of hidden classes. Cent Eur J Oper Res 27, 161–177 (2019). https://doi.org/10.1007/s10100-017-0496-5
DOI: https://doi.org/10.1007/s10100-017-0496-5