Learning in the Presence of Large Fluctuations: A Study of Aggregation and Correlation

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7765)

Abstract

Consider a scenario where one aims to learn models from data characterized by very large fluctuations that are attributable neither to noise nor to outliers. This may be the case, for instance, when predicting the potential future damages of earthquakes or oil spills, or when conducting financial data analysis. It follows that, in such a situation, the standard central limit theorem does not apply, since the associated Gaussian distribution exponentially suppresses large fluctuations. In this paper, we present an analysis of data aggregation and correlation in such scenarios. To this end, we introduce the Lévy, or stable, distribution, which is a generalization of the Gaussian distribution. Our theoretical conclusions are illustrated with various simulations, as well as on a benchmark financial database. We show which specific strategies should be adopted for aggregation, depending on the stability exponent of the Lévy distribution. First, our results indicate that the correlation between two attributes may be underestimated if a Gaussian distribution is erroneously assumed. Secondly, we show that, in the scenario where we aim to learn a set of rules to estimate the level of stability of a stock market, the Lévy distribution produces superior results. Thirdly, we illustrate that, in a multi-relational database mining setting, aggregation using average values may be highly unsuitable.
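
The claim that averaging can be an unsuitable aggregate under a stable law can be illustrated with a small simulation. The sketch below is not the authors' experiment; it assumes the standard Cauchy distribution as a convenient special case of a stable distribution (stability exponent 1) and compares it with a standard Gaussian. The function names `cauchy`, `gaussian` and `spread_of_aggregates`, the sample size, the number of trials and the seed are all illustrative choices. For each distribution it draws many samples of equal size, aggregates each sample by mean or median, and reports the interquartile range of the resulting aggregates.

```python
import math
import random
import statistics


def cauchy(rng):
    """Draw one standard Cauchy variate (a stable law with exponent 1)."""
    # Inverse-CDF method: tan(pi * (U - 1/2)) with U uniform on [0, 1).
    return math.tan(math.pi * (rng.random() - 0.5))


def gaussian(rng):
    """Draw one standard Gaussian variate (the stable law with exponent 2)."""
    return rng.gauss(0.0, 1.0)


def spread_of_aggregates(draw, aggregate, n=1000, trials=200, seed=0):
    """Interquartile range of an aggregate over repeated samples of size n.

    All parameter defaults are arbitrary values chosen for illustration.
    """
    rng = random.Random(seed)
    values = [aggregate([draw(rng) for _ in range(n)]) for _ in range(trials)]
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


if __name__ == "__main__":
    for name, draw in (("Gaussian", gaussian), ("Cauchy", cauchy)):
        for agg_name, agg in (("mean", statistics.fmean),
                              ("median", statistics.median)):
            spread = spread_of_aggregates(draw, agg)
            print(f"{name:8s} {agg_name:6s} IQR of aggregate: {spread:.3f}")
```

Under these assumptions, the mean of Gaussian samples concentrates as the sample size grows, whereas the mean of Cauchy samples is itself Cauchy distributed and stays as dispersed as a single observation; the median concentrates in both cases. This is only one instance of the general point made in the abstract, which concerns the whole family of stable distributions.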

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Paquet, E., Viktor, H.L., Guo, H. (2013). Learning in the Presence of Large Fluctuations: A Study of Aggregation and Correlation. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2012. Lecture Notes in Computer Science, vol 7765. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37382-4_4

  • DOI: https://doi.org/10.1007/978-3-642-37382-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37381-7

  • Online ISBN: 978-3-642-37382-4

  • eBook Packages: Computer Science, Computer Science (R0)
