Skip to main content
Log in

Binary Searching with Nonuniform Costs and Its Application to Text Retrieval

  • Published:
Algorithmica Aims and scope Submit manuscript

Abstract.

We study the problem of minimizing the expected cost of binary searching for data where the access cost is not fixed and depends on the last accessed element, such as data stored in magnetic or optical disk. We present an optimal algorithm for this problem that finds the optimal search strategy in O(n 3 ) time, which is the same time complexity of the simpler classical problem of fixed costs. Next, we present two practical linear expected time algorithms, under the assumption that the access cost of an element is independent of its physical position. Both practical algorithms are online, that is, they find the next element to access as the search proceeds. The first one is an approximate algorithm which minimizes the access cost disregarding the goodness of the problem partitioning. The second one is a heuristic algorithm, whose quality depends on its ability to estimate the final search cost, and therefore it can be tuned by recording statistics of previous runs.

We present an application for our algorithms related to text retrieval. When a text collection is large it demands specialized indexing techniques for efficient access. One important type of index is the suffix array, where data access is provided through an indirect binary search on the text stored in magnetic disk or optical disk. Under this cost model we prove that the optimal algorithm cannot perform better than Ω(1/ log n) times the standard binary search. We also prove that the approximate strategy cannot, on average, perform worse than 39% over the optimal one. We confirm the analytical results with simulations, showing improvements between 34% (optimal) and 60% (online) over standard binary search for both magnetic and optical disks.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Author information

Authors and Affiliations

Authors

Additional information

Received February 13, 1997; revised May 27, 1998.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Navarro, G., Baeza-Yates, R., Barbosa, E. et al. Binary Searching with Nonuniform Costs and Its Application to Text Retrieval . Algorithmica 27, 145–169 (2000). https://doi.org/10.1007/s004530010010

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/s004530010010

Navigation