An approach to structure determination and estimation of hierarchical Archimedean Copulas and its application to Bayesian classification

Journal of Intelligent Information Systems

Abstract

Copulas are distribution functions with standard uniform univariate marginals. They are widely used for studying dependence among continuously distributed random variables, with applications in finance and quantitative risk management; see, e.g., the pricing of collateralized debt obligations (Hofert and Scherer, Quantitative Finance, 11(5), 775–787, 2011). Modeling complex dependence structures among variables has recently become increasingly popular in statistics and related fields, one example being data mining (e.g., cluster analysis, evolutionary algorithms or classification). The present work considers an estimator for both the structure and the parameters of hierarchical Archimedean copulas. Such copulas have recently become popular alternatives to the widely used Gaussian copulas. The proposed estimator is based on the pairwise inversion of Kendall’s tau estimator recently considered in the literature, but it can be based on other estimators as well, such as likelihood-based ones. A simple algorithm implementing the proposed estimator is provided. Its performance is investigated in several experiments, including a comparison with other available estimators. The results show that the proposed estimator can be a suitable alternative in terms of goodness-of-fit and computational efficiency. Additionally, an application of the estimator to copula-based Bayesian classification is presented. A set of new Archimedean and hierarchical Archimedean copula-based Bayesian classifiers is compared with other commonly known classifiers in terms of accuracy on several well-known datasets. The results show that, on low-dimensional data, the hierarchical Archimedean copula-based Bayesian classifiers achieve accuracy similar to that of highly accurate classifiers such as support vector machines or ensemble methods, while keeping the produced models rather comprehensible; their applicability to high-dimensional data is limited by their high computational cost.
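To make the idea of pairwise inversion of Kendall's tau concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes the well-known closed-form relations between Kendall's tau and the parameter for the Clayton family (tau = theta / (theta + 2)) and the Gumbel family (tau = 1 - 1/theta). All function names are hypothetical and chosen for illustration only, and the structure determination step of the proposed estimator is not covered.

```python
import random
from itertools import combinations


def kendall_tau(x, y):
    """Sample Kendall's tau (tau-a); assumes continuous data, i.e. no ties."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2); meaningful for 0 <= tau < 1."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1/theta; meaningful for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)


def sample_clayton(n, d, theta, rng):
    """Marshall-Olkin sampling of a d-dimensional Clayton copula (theta > 0)."""
    rows = []
    for _ in range(n):
        v = rng.gammavariate(1.0 / theta, 1.0)  # frailty variable
        rows.append([(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
                     for _ in range(d)])
    return rows


def pairwise_theta(data, inverse=clayton_theta_from_tau):
    """Parameter estimate for every pair of columns via tau inversion."""
    cols = list(zip(*data))
    return {(i, j): inverse(kendall_tau(cols[i], cols[j]))
            for i, j in combinations(range(len(cols)), 2)}


if __name__ == "__main__":
    rng = random.Random(0)
    data = sample_clayton(400, 3, 2.0, rng)  # true theta = 2, i.e. tau = 0.5
    for pair, theta in pairwise_theta(data).items():
        print(pair, round(theta, 3))
```

For an exchangeable sample like the one above, all pairwise estimates should be close to one another. In a hierarchical model, differences between the pairwise values are what carry information about the nesting, which is why a pairwise estimator is a natural building block for structure determination.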

Notes

  1. also called fully-nested Archimedean copula

  2. also called partially-nested Archimedean copula

References

  • Aas, K., Czado, C., Frigessi, A., Bakken, H. (2009). Pair-copula constructions of multiple dependence. Insurance: Mathematics and Economics, 44(2), 182–198.

  • Alcalá, J., Fernández, A., Luengo, J., Derrac, J., García, S., Sánchez, L., Herrera, F. (2010). Keel data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing, 17, 255–287.

  • Bache, K., & Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.

  • Berg, D. (2009). Copula goodness-of-fit testing: an overview and power comparison. The European Journal of Finance, 15(7–8), 675–701.

  • Bouyé, E., Durrleman, V., Nikeghbali, A., Riboulet, G., Roncalli, T. (2000). Copulas for finance - a reading guide and some applications. Available at SSRN 1032533.

  • Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.

  • Breiman, L., Friedman, J., Olshen, R., Stone, C. (1984). Classification and Regression Trees. Wadsworth.

  • Chen, X., Fan, Y., Patton, A.J. (2004). Simple tests for models of dependence between multiple financial time series, with applications to US equity returns and exchange rates. Discussion paper 483, Financial Markets Group, London School of Economics.

  • Clarke, B., Fokoue, E., Zhang, H.H. (2009). Principles and Theory for Data Mining and Machine Learning. Springer.

  • Cramér, H. (1928). On the composition of elementary errors: First paper: Mathematical deductions. Scandinavian Actuarial Journal, 1928(1), 13–74.

  • Cuvelier, E., & Noirhomme-Fraiture, M. (2005). Clayton copula and mixture decomposition. In Applied Stochastic Models and Data Analysis, ASMDA’05. Brest.

  • Freund, Y., & Schapire, R.E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. In Computational learning theory, (pp. 23–37). Springer.

  • Genest, C., & Favre, A. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering, 12, 347–368.

  • Genest, C., & Rémillard, B. (2008). Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. In Annales de l’Institut Henri Poincaré: Probabilités et Statistiques, (Vol. 44, pp. 1096–1127).

  • Genest, C., Rémillard, B., Beaudoin, D. (2009). Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics, 44(2), 199–213.

  • Genest, C., & Rivest, L.P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association, 88(423), 1034–1043.

  • González-Fernández, Y., & Soto, M. (2012). Copulaedas: An R package for estimation of distribution algorithms based on copulas. arXiv:1209.5429.

  • Górecki, J., Hofert, M., Holeňa, M. (2014). On the consistency of an estimator for hierarchical Archimedean copulas. In Talašová, J., Stoklasa, J., Talášek, T. (Eds.) 32nd International Conference on Mathematical Methods in Economics, (pp. 239–244). Olomouc: Palacký University.

  • Górecki, J., & Holeňa, M. (2013). An alternative approach to the structure determination of hierarchical Archimedean copulas. In Proceedings of the 31st International Conference on Mathematical Methods in Economics (MME 2013) (pp. 201–206). Jihlava.

  • Górecki, J., & Holeňa, M. (2014). Structure determination and estimation of hierarchical Archimedean copulas based on Kendall correlation matrix. In Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (Eds.) New Frontiers in Mining Complex Patterns, Lecture Notes in Computer Science, (pp. 132–147).

  • Hofert, M. (2010a). Construction and sampling of nested Archimedean copulas. In Jaworski, P., Durante, F., Härdle, W.K., Rychlik, T. (Eds.), Copula Theory and Its Applications, Lecture Notes in Statistics, vol 198, (pp. 147–160). Springer Berlin Heidelberg.

  • Hofert, M. (2010b). Sampling Nested Archimedean Copulas with Applications to CDO Pricing. Suedwestdeutscher Verlag fuer Hochschulschriften.

  • Hofert, M. (2011). Efficiently sampling nested Archimedean copulas. Computational Statistics and Data Analysis, 55(1), 57–70.

  • Hofert, M. (2012). A stochastic representation and sampling algorithm for nested Archimedean copulas. Journal of Statistical Computation and Simulation, 82(9), 1239–1255. doi:10.1080/00949655.2011.574632.

  • Hofert, M., Mächler, M., McNeil, A.J. (2012). Likelihood inference for Archimedean copulas in high dimensions under known margins. Journal of Multivariate Analysis, 110, 133–150.

  • Hofert, M., Mächler, M., McNeil, A.J. (2013). Archimedean copulas in high dimensions: Estimators and numerical challenges motivated by financial applications. Journal de la Société Française de Statistique, 154(1), 25–63.

  • Hofert, M., & Scherer, M. (2011). CDO pricing with nested Archimedean copulas. Quantitative Finance, 11(5), 775–787.

  • Holeňa, M., & Ščavnický, M. (2013). Application of copulas to data mining based on observational logic. In ITAT: Information Technologies Applications and Theory Workshops, Posters, and Tutorials, North Charleston: CreateSpace Independent Publishing Platform, Donovaly Slovakia.

  • Joe, H. (1997). Multivariate Models and Dependence Concepts. London: Chapman & Hall.

  • Kao, S.C., Ganguly, A.R., Steinhaeuser, K. (2009). Motivating complex dependence structures in data mining: A case study with anomaly detection in climate. In International Conference on Data Mining Workshops, (pp. 223–230). doi:10.1109/ICDMW.2009.37.

  • Kao, S.C., & Govindaraju, R.S. (2008). Trivariate statistical analysis of extreme rainfall events via Plackett family of copulas. Water Resources Research, 44.

  • Kojadinovic, I. (2010). Hierarchical clustering of continuous variables based on the empirical copula process and permutation linkages. Computational Statistics & Data Analysis, 54(1), 90–108.

  • Kojadinovic, I., & Yan, J. (2010a). Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insurance: Mathematics and Economics, 47, 52–63.

  • Kojadinovic, I., & Yan, J. (2010b). Modeling multivariate distributions with continuous margins using the copula R package. Journal of Statistical Software, 34(9), 1–20.

  • Kuhn, G., Khan, S., Ganguly, A.R., Branstetter, M.L. (2007). Geospatial-temporal dependence among weekly precipitation extremes with applications to observations and climate model simulations in South America. Advances in Water Resources, 30(12), 2401–2423.

  • Lachenbruch, P.A. (1975). Discriminant analysis. Wiley Online Library.

  • Lascio, F., & Giannerini, S. (2012). A copula-based algorithm for discovering patterns of dependent observations. Journal of Classification, 29, 50–75. doi:10.1007/s00357-012-9099-y.

  • Maity, R., & Kumar, D.N. (2008). Probabilistic prediction of hydroclimatic variables with nonparametric quantification of uncertainty. Journal of Geophysical Research, 113.

  • McNeil, A.J. (2008). Sampling nested Archimedean copulas. Journal of Statistical Computation and Simulation, 78(6), 567–581.

  • McNeil, A.J., & Nešlehová, J. (2009). Multivariate Archimedean copulas, d-monotone functions and ℓ1-norm symmetric distributions. The Annals of Statistics, 37, 3059–3097.

  • Moehmel, S., Steinfeldt, N., Engelschalt, S., Holena, M., Kolf, S., Baerns, M., Dingerdissen, U., Wolf, D., Weber, R., Bewersdorf, M. (2008). New catalytic materials for the high-temperature synthesis of hydrocyanic acid from methane and ammonia by high-throughput approach. Applied Catalysis A: General, 334(1), 73–83.

  • Nelsen, R. (2006). An Introduction to Copulas, 2nd edn. Springer.

  • Okhrin, O., Okhrin, Y., Schmid, W. (2013a). On the structure and estimation of hierarchical Archimedean copulas. Journal of Econometrics, 173(2), 189–204. http://www.sciencedirect.com/science/article/pii/S0304407612002667.

  • Okhrin, O., Okhrin, Y., Schmid, W. (2013b). Properties of hierarchical Archimedean copulas. Statistics & Risk Modeling, 30(1), 21–54.

  • Okhrin, O., & Ristig, A. (2014). Hierarchical Archimedean copulae: The HAC package. Journal of Statistical Software, 58(4). http://www.jstatsoft.org/v58/i04.

  • Rey, M., & Roth, V. (2012). Copula mixture model for dependency-seeking clustering. Preprint. arXiv:1206.6433

  • Sathe, S. (2006). A novel Bayesian classifier using copula functions. Preprint arXiv:cs/0611150.

  • Savu, C., & Trede, M. (2008). Goodness-of-fit tests for parametric families of Archimedean copulas. Quantitative Finance, 8(2), 109–116.

  • Savu, C., & Trede, M. (2010). Hierarchies of Archimedean copulas. Quantitative Finance, 10, 295–304.

  • Segers, J., & Uyttendaele, N. (2014). Nonparametric estimation of the tree structure of a nested Archimedean copula. Computational Statistics & Data Analysis, 72, 190–204.

  • Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut de Statistique de l’Université de Paris, 8, 229–231.

  • Smith, M.S., Gan, Q., Kohn, R.J. (2012). Modelling dependence using skew t copulas: Bayesian inference and applications. Journal of Applied Econometrics, 27(3), 500–522.

  • Vapnik, V. (2000). The nature of statistical learning theory. Springer.

  • Wang, L., Guo, X., Zeng, J., Hong, Y. (2012). Copula estimation of distribution algorithms based on exchangeable Archimedean copula. International Journal of Computer Applications in Technology, 43, 13–20. doi:10.1504/IJCAT.2012.045836, http://inderscience.metapress.com/content/42R4M650P16V1227.

  • Wolpert, D.H. (2002). The supervised learning no-free-lunch theorems. In Soft Computing and Industry (pp. 25–42). Springer.

  • Yuan, A., Chen, G., Zhou, Z.C., Bonney, G., Rotimi, C. (2008). Gene copy number analysis for family data using semiparametric copula model. Bioinformatics and Biology Insights, 2, 343–355.


Author information

Corresponding author

Correspondence to Jan Górecki.

Additional information

The work of Jan Górecki was funded by the project SGS/21/2014 - Advanced methods for knowledge discovery from data and their application in expert systems, Czech Republic. The work of Martin Holeňa was funded by the Czech Science Foundation (GA ČR) grant 13-17187S.

About this article

Cite this article

Górecki, J., Hofert, M. & Holeňa, M. An approach to structure determination and estimation of hierarchical Archimedean Copulas and its application to Bayesian classification. J Intell Inf Syst 46, 21–59 (2016). https://doi.org/10.1007/s10844-014-0350-3

  • DOI: https://doi.org/10.1007/s10844-014-0350-3
