DOI:10.1145/2882903.2914833
poster

K-means Split Revisited: Well-grounded Approach and Experimental Evaluation

Published: 26 June 2016

ABSTRACT

The R-tree is a data structure used for multidimensional indexing. Essentially, it is a balanced tree consisting of nested hyper-rectangles that are used to locate the data. One of the most performance-sensitive parts of this data structure is its split algorithm, which runs when a node overflows. A split can be performed in multiple ways, according to many different criteria, and in general the problem of finding an optimal solution is NP-hard. Many heuristic split algorithms exist. In this paper we study an existing k-means node split algorithm. We describe a number of serious issues in its theoretical foundation, which led us to redesign the k-means split. We propose several well-grounded solutions to the re-emerged problem of k-means split. Finally, we report comparison results obtained using PostgreSQL and a contemporary benchmark for multidimensional structures.
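
The abstract does not spell out the split algorithm, so the following is only a minimal sketch of the general idea behind a k-means-style node split with k = 2. It assumes 2-D rectangles stored as `(xmin, ymin, xmax, ymax)` and clusters them by their center points. All names (`two_means_split`, `mbr`, `center`, `sq_dist`) are hypothetical, and this is neither the existing k-means split studied in the paper nor the authors' redesigned version.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout
Point = Tuple[float, float]


def center(r: Rect) -> Point:
    """Center point of a rectangle."""
    return ((r[0] + r[2]) / 2.0, (r[1] + r[3]) / 2.0)


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def two_means_split(rects: List[Rect], max_iter: int = 20):
    """Split an overflowing node's entries into two groups with 2-means
    on the rectangle centers. Illustrative only: it does not enforce the
    minimum node fill that a real R-tree split would need."""
    if len(rects) < 2:
        raise ValueError("need at least two entries to split")
    pts = [center(r) for r in rects]
    # Deterministic seeding (an assumption): the first center and the
    # center farthest from it.
    c = [pts[0], max(pts, key=lambda p: sq_dist(p, pts[0]))]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each entry goes to its nearest centroid.
        new_labels = [0 if sq_dist(p, c[0]) <= sq_dist(p, c[1]) else 1
                      for p in pts]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for k in (0, 1):
            members = [p for p, l in zip(pts, labels) if l == k]
            if members:
                c[k] = (sum(p[0] for p in members) / len(members),
                        sum(p[1] for p in members) / len(members))
    g0 = [r for r, l in zip(rects, labels) if l == 0]
    g1 = [r for r, l in zip(rects, labels) if l == 1]
    # Degenerate case (e.g. all centers identical): fall back to halving.
    if not g0 or not g1:
        half = len(rects) // 2
        g0, g1 = rects[:half], rects[half:]
    return g0, g1
```

Calling `two_means_split` on the entries of an overflowing node returns two lists of rectangles; passing each list to `mbr` gives the bounding rectangles that would be stored in the parent node. Clustering on center points is one simple choice among several and ignores rectangle extent.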

        • Published in

          SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
          June 2016
          2300 pages
          ISBN:9781450335317
          DOI:10.1145/2882903

          Copyright © 2016 Owner/Author

          Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 785 of 4,003 submissions, 20%
