The Data Mining Group at University of Vienna

Clustering, Causality, Massive Data and Applications

  • Datenbankgruppen vorgestellt (Database Groups Presented)
Datenbank-Spektrum

Abstract

How can we extract meaningful knowledge from massive amounts of data? The data mining group at the University of Vienna contributes novel methods for exploratory data analysis. Our main research focus is unsupervised learning, where we aim to identify any kind of non-random structure or pattern in the data without restricting ourselves to a predefined target variable or analysis goal. Our major lines of current research are clustering, causality detection and highly efficient exploratory data analysis on massive data. In addition, we develop application-specific methods that address challenges in biomedicine, neuroscience and environmental sciences. In teaching, we offer fundamental and advanced courses in data mining, machine learning and scientific data management for Bachelor and Master students of computer science and related programs.

Notes

  1. https://dm.cs.univie.ac.at/.

  2. https://informatik.univie.ac.at/en/study/courses-of-study/master/master-computer-science/.

Acknowledgements

We thank the University of Vienna, the Austrian government and the Wiener Wissenschafts-, Forschungs- und Technologiefonds for the funding of our research. We thank all collaboration partners, most importantly Christian Böhm, Norbert Brändle, Tobias Golling, Anke Mayer-Baese, Irene Schicker, Junming Shao, Xin Sun and Afra Wohlschläger.

Author information

Corresponding author

Correspondence to Martin Perdacher.

About this article

Cite this article

Altinigneli, C., Bauer, L.G.M., Behzadi, S. et al. The Data Mining Group at University of Vienna. Datenbank Spektrum 20, 71–79 (2020). https://doi.org/10.1007/s13222-020-00337-9

  • DOI: https://doi.org/10.1007/s13222-020-00337-9
