Abstract
Recently, Arya, da Fonseca, and Mount [STOC 2011, SODA 2012] made notable progress in improving the ϵ-dependencies in the space/query-time tradeoffs for (1 + ϵ)-factor approximate nearest neighbor search in fixed-dimensional Euclidean spaces. However, ϵ-dependencies in the preprocessing time were not considered, and so their data structures cannot be used to derive faster algorithms for offline proximity problems. Known algorithms for many such problems, including approximate bichromatic closest pair (BCP) and approximate Euclidean minimum spanning trees (EMST), typically have factors near (1/ϵ)^{d/2±O(1)} in the running time when the dimension d is a constant.
We describe a technique that breaks the (1/ϵ)^{d/2} barrier and yields new results for many well-known proximity problems, including:
• an O((1/ϵ)^{d/3+O(1)} n)-time randomized algorithm for approximate BCP,
• an O((1/ϵ)^{d/3+O(1)} n log n)-time algorithm for approximate EMST, and
• an O(n log n + (1/ϵ)^{d/3+O(1)} n)-time algorithm to answer n approximate nearest neighbor queries on n points.
Using additional bit-packing tricks, we can shave off the log n factor for EMST, and even move most of the ϵ-factors to a sublinear term.
The improvement arises from a new time bound for exact "discrete Voronoi diagrams", which were previously used in the construction of ϵ-kernels (or extent-based coresets), a well-known tool for another class of fundamental problems. This connection leads to more results, including:
• a streaming algorithm to maintain an approximate diameter in O((1/ϵ)^{d/3+O(1)}) time per point using O((1/ϵ)^{d/2+O(1)}) space, and
• a streaming algorithm to maintain an ϵ-kernel in O((1/ϵ)^{d/4+O(1)}) time per point using O((1/ϵ)^{d/2+O(1)}) space.