Dictionary-based order-preserving string compression

The VLDB Journal

Abstract.

Just as no database exists without indexes, no index implementation exists without order-preserving key compression, in particular prefix and tail compression. However, despite the great potential for making indexes smaller and faster, the application of general compression methods to ordered data sets has advanced very little. This paper demonstrates that fast dictionary-based methods can be applied to order-preserving compression with almost the same freedom as in the general case. The proposed technique has the same speed as traditional order-indifferent dictionary encoding and a compression rate only marginally lower. Procedures for encoding and for generating the encoding tables are described, covering such order-related features as ordered data set restrictions, sensitivity and insensitivity to character position, and one-symbol encoding of each frequent trailing character sequence. The experimental results demonstrate fivefold compression on real-life data sets and twelvefold compression on Wisconsin benchmark text fields.
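
To illustrate what "order-preserving" means for a dictionary code, here is a minimal Python sketch. It is not the paper's method: the paper encodes substrings of keys, whereas this sketch uses the much simpler assumption that every whole key is known in advance and is replaced by its rank in the sorted set of distinct keys. The function names (`build_dictionary`, `encode`, `decode`) and the sample keys are hypothetical and chosen only for this example.

```python
from bisect import bisect_left


def build_dictionary(keys):
    """Return the distinct keys in sorted order; a key's code is its index."""
    return sorted(set(keys))


def encode(dictionary, key):
    """Map a key to its integer code (its rank among the distinct keys)."""
    i = bisect_left(dictionary, key)
    if i == len(dictionary) or dictionary[i] != key:
        raise KeyError(key)  # this simplified scheme cannot encode unseen keys
    return i


def decode(dictionary, code):
    """Map an integer code back to the original key."""
    return dictionary[code]


# Hypothetical sample data, used only to exercise the functions.
keys = ["pear", "apple", "fig", "apple", "banana"]
dictionary = build_dictionary(keys)
codes = [encode(dictionary, k) for k in keys]

# Comparing two codes gives the same answer as comparing the two keys,
# so an index could compare the compressed form without decoding it.
order_preserved = all(
    (a < b) == (encode(dictionary, a) < encode(dictionary, b))
    for a in keys
    for b in keys
)
```

Because the codes are assigned in sorted key order, comparisons on codes agree with comparisons on the original strings. A scheme of the kind the abstract describes must keep this property while also handling keys that were not seen when the encoding tables were built, which this sketch does not attempt.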

Additional information

Edited by M.T. Ozsu. Received 1 February 1995 / Accepted 1 November 1995

About this article

Cite this article

Antoshenkov, G. Dictionary-based order-preserving string compression. The VLDB Journal 6, 26–39 (1997). https://doi.org/10.1007/s007780050031

  • DOI: https://doi.org/10.1007/s007780050031
