Asymptotically optimal declustering schemes for 2-dim range queries

https://doi.org/10.1016/S0304-3975(02)00742-9

Abstract

Declustering techniques have been widely adopted in parallel storage systems (e.g., disk arrays) to speed up bulk retrieval of multidimensional data. A declustering scheme distributes data items among multiple disks, thus enabling parallel data access and reducing query response time. We measure the performance of any declustering scheme as its worst-case additive deviation from the ideal scheme. The goal is thus to design declustering schemes with as small an additive error as possible. We describe a number of declustering schemes with additive error O(log M) for 2-dimensional range queries, where M is the number of disks. These are the first results giving an O(log M) upper bound for all values of M. Our second result is a lower bound on the additive error. It is known that, except for a few stringent cases, the additive error of any 2-dimensional declustering scheme is at least one. We strengthen this lower bound to Ω((log M)^((d−1)/2)) for d-dimensional schemes and to Ω(log M) for 2-dimensional schemes, thus proving that the 2-dimensional schemes described in this paper are (asymptotically) optimal. These results are obtained by establishing a connection to geometric discrepancy. We also present simulation results to evaluate the performance of these schemes in practice.
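
The additive-error measure can be made concrete with a small brute-force sketch. It assumes the definition common in the declustering literature, which the abstract does not spell out: a range query covering A grid tiles ideally takes ceil(A / M) parallel accesses, its actual cost is the largest number of its tiles stored on any single disk, and the additive error is the worst-case difference over all range queries. The function names and the disk modulo scheme (row + column) mod M are illustrative assumptions, not the schemes described in the paper.

```python
from itertools import product
from math import ceil


def additive_error(scheme, n, m):
    """Worst-case additive error of `scheme` over all range queries on an n x n grid.

    `scheme(r, c)` returns the disk (0..m-1) holding tile (r, c).
    Assumed definition: error of a query = max tiles on one disk - ceil(area / m).
    """
    worst = 0
    # Enumerate every axis-aligned rectangle [r0..r1] x [c0..c1].
    for r0, c0 in product(range(n), repeat=2):
        for r1, c1 in product(range(r0, n), range(c0, n)):
            counts = [0] * m
            for r in range(r0, r1 + 1):
                for c in range(c0, c1 + 1):
                    counts[scheme(r, c)] += 1
            area = (r1 - r0 + 1) * (c1 - c0 + 1)
            worst = max(worst, max(counts) - ceil(area / m))
    return worst


def disk_modulo(m):
    """Illustrative scheme (not from the paper): tile (r, c) goes to disk (r + c) mod m."""
    return lambda r, c: (r + c) % m


if __name__ == "__main__":
    for m in (2, 3, 4, 5):
        print(m, additive_error(disk_modulo(m), n=6, m=m))
```

The exhaustive search is only practical for small grids, since the number of queries grows as n^4; it is meant to illustrate the measure, not to reproduce the paper's simulations.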

A preliminary version of this paper appeared in the 8th International Conference on Database Theory, ICDT 2001. The first two authors did this work at Bell Laboratories.