Revdbscan and Flexscan—\(O(n\log n)\) Clustering Algorithms

  • Conference paper
Neural Information Processing (ICONIP 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1516)

Abstract

The goal of this paper is to present two new algorithms conceptually close to density-based clustering. Both algorithms handle clustering problems at least as well as the dbscan algorithm, and flexscan additionally copes with nonuniform data distributions. Both algorithms run in \(O(n \log n)\) time, in contrast to the well-known dbscan algorithm, whose complexity is \(O(n^2)\). We also show that the complexity of dbscan cannot be reduced to \(O(n \log n)\) merely by using locality-sensitive hashing trees, R-trees, or kd-trees.

In the final part of the paper, we present results on benchmark datasets. Results clearly show the superiority of the proposed algorithms.
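
For context, the quadratic cost that the abstract attributes to dbscan can be seen in a minimal sketch of the classic algorithm with brute-force neighbourhood queries. This is not code from the paper, and it does not show revdbscan or flexscan; the names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative assumptions.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Brute-force eps-neighbourhood of point i: scans all n points, O(n)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Classic DBSCAN sketch; one O(n) query per point gives O(n^2) overall."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Each point triggers at most one call to `region_query`, and each call scans all `n` points, which is where the \(O(n^2)\) bound comes from in this naive form.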

Notes

  1. The symbol \(\log ^*\) denotes the iterated logarithm.
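
As an illustration of this note, the iterated logarithm counts how many times the logarithm must be applied before the value drops to 1 or below. The sketch below assumes base 2 (the note does not state the base), and the function name `log_star` is hypothetical.

```python
import math


def log_star(n):
    """Iterated logarithm (base 2 assumed): number of log2 applications until x <= 1."""
    count = 0
    x = n
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


values = [log_star(n) for n in (1, 2, 4, 16, 65536)]
```

The function grows extremely slowly: under the base-2 assumption it is 4 for \(n = 65536\).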

Author information

Corresponding author

Correspondence to Norbert Jankowski.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Jankowski, N. (2021). Revdbscan and Flexscan—\(O(n\log n)\) Clustering Algorithms. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Communications in Computer and Information Science, vol 1516. Springer, Cham. https://doi.org/10.1007/978-3-030-92307-5_75

  • DOI: https://doi.org/10.1007/978-3-030-92307-5_75

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92306-8

  • Online ISBN: 978-3-030-92307-5

  • eBook Packages: Computer Science (R0)
