Optimal unions of hidden classes

  • Original Paper
Central European Journal of Operations Research

Abstract

Cluster analysis is a traditional tool for multivariate data processing. Using the k-means method, we can split a pattern set into a given number of clusters. These clusters can be used for the final classification into known output classes. This paper focuses on various approaches to the optimal union of hidden classes. The resulting tasks are binary programming or convex optimization problems. Another way to obtain hidden classes is to design an imperfect classifier system. The novel context out learning approach is also discussed: simple classifiers serve as the basis of a system of hidden classes, which are then easy to unite into output classes using the optimal algorithm. All these approaches are useful in many applications, including econometric research. There are two main methodologies, supervised and unsupervised learning, based on a given pattern set with known or unknown output classification, respectively. With supervised learning, we can combine context out learning with the optimal union of hidden classes to obtain the final classifier. With unsupervised learning, we begin with cluster analysis or a similar approach to obtain the hidden class system for the subsequent optimal union. The optimal union algorithm is therefore widely applicable to classification tasks of any kind. The presented techniques are demonstrated on an artificial pattern set and on real data related to crisis prediction based on the clustering of macroeconomic indicators.
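As a rough illustration of the idea of uniting hidden classes into output classes, the sketch below maps each hidden class (for example, a k-means cluster label) to the output class it overlaps most. It is not the formulation from the paper, which is not visible in this preview. It assumes that every hidden class is assigned to exactly one output class and that the objective is the total number of misclassified patterns; under those assumptions the binary program separates per hidden class, so a majority vote is optimal. The function names and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def union_hidden_classes(hidden, output):
    """Map each hidden class to the output class it overlaps most.

    Assumption (not from the paper): one output class per hidden class,
    objective = total misclassification count on the pattern set.
    """
    table = defaultdict(Counter)  # contingency table: hidden -> output counts
    for h, c in zip(hidden, output):
        table[h][c] += 1
    # The assignment problem decomposes per hidden class, so picking the
    # most frequent output class in each row minimizes the total error.
    return {h: counts.most_common(1)[0][0] for h, counts in table.items()}


def classify(hidden, mapping):
    """Translate hidden class labels into output class labels."""
    return [mapping[h] for h in hidden]


# Hypothetical toy data: four hidden classes, two output classes.
hidden = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
output = ["crisis", "crisis", "calm", "calm", "calm",
          "crisis", "crisis", "crisis", "calm", "calm"]

mapping = union_hidden_classes(hidden, output)
predicted = classify(hidden, mapping)
errors = sum(p != t for p, t in zip(predicted, output))
print(mapping, errors)
```

In this toy case, hidden classes 0 and 2 are united into "crisis" and hidden classes 1 and 3 into "calm", leaving one misclassified pattern. Additional constraints or a different loss would generally require a genuine binary or convex program rather than this per-class vote.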

Acknowledgements

The authors would like to acknowledge the support of the research grants SGS 17/196/OHK4/3T/14 and SGS 17/197/OHK4/3T/14.

Author information

Corresponding author

Correspondence to Radek Hrebik.

Cite this article

Hrebik, R., Kukal, J. & Jablonsky, J. Optimal unions of hidden classes. Cent Eur J Oper Res 27, 161–177 (2019). https://doi.org/10.1007/s10100-017-0496-5

  • DOI: https://doi.org/10.1007/s10100-017-0496-5
