Abstract
A focus on scalability to very large datasets has distinguished the KDD endeavour right from the start of the field. In the present stage of its development, the field has begun to address the issue in earnest, and a number of different techniques for scaling up KDD algorithms have emerged. Traditionally, such techniques have concentrated on the search aspects of the problem, using algorithmic methods to avoid examining parts of the search space or exploiting properties of the underlying host systems to speed up processing. These techniques guarantee perfectly correct solutions, but can never reach sublinear complexity. In contrast, researchers have recently begun to take a fresh and principled look at stochastic sampling techniques, which give only an approximate quality guarantee but can make runtimes almost independent of the size of the database at hand. In the talk, we give an overview of both classes of approaches, using individual examples from our own work to illustrate in more detail how such techniques operate. We briefly outline how active learning elements may enhance KDD approaches in the future.
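The abstract contrasts exact search-based techniques with sampling-based approximation, but does not spell out why sampling can make runtimes nearly independent of database size. The sketch below (not taken from the paper; the function names and parameters are illustrative) uses the standard two-sided Hoeffding bound, under which the sample size needed to estimate a [0,1]-bounded quantity within error ε with confidence 1−δ depends only on ε and δ, not on the number of records.

```python
import math
import random


def hoeffding_sample_size(epsilon: float, delta: float) -> int:
    """Smallest n such that, for i.i.d. draws of a [0,1]-bounded quantity,
    the sample mean deviates from the true mean by more than epsilon with
    probability at most delta (two-sided Hoeffding bound).
    Note: n depends only on epsilon and delta, not on the database size."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_frequency(database, predicate, epsilon=0.01, delta=0.05, rng=None):
    """Estimate the fraction of records satisfying `predicate` by uniform
    sampling with replacement, with an (epsilon, delta) quality guarantee.
    Illustrative only; a hypothetical stand-in for sampling-based KDD steps."""
    rng = rng or random.Random()
    n = hoeffding_sample_size(epsilon, delta)
    hits = sum(predicate(database[rng.randrange(len(database))]) for _ in range(n))
    return hits / n


if __name__ == "__main__":
    # Toy database: one million records, roughly 30% of which satisfy the predicate.
    rng = random.Random(0)
    db = [rng.random() < 0.3 for _ in range(1_000_000)]
    est = estimate_frequency(db, lambda rec: rec, epsilon=0.01, delta=0.05, rng=rng)
    print(f"sample size: {hoeffding_sample_size(0.01, 0.05)}, estimate: {est:.3f}")
```

The same number of sampled records (about 18,000 for ε = 0.01, δ = 0.05) would suffice for a database of a billion rows, which is the sense in which such approximate techniques can decouple runtime from database size.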
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wrobel, S. (2001). Scalability, Search, and Sampling: From Smart Algorithms to Active Discovery. In: De Raedt, L., Siebes, A. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2001. Lecture Notes in Computer Science, vol 2168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44794-6_45
DOI: https://doi.org/10.1007/3-540-44794-6_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42534-2
Online ISBN: 978-3-540-44794-8
eBook Packages: Springer Book Archive