Rank and select operations on a word

doi:10.1016/j.ipl.2021.106148

Information Processing Letters

Volume 172, December 2021, 106148

https://doi.org/10.1016/j.ipl.2021.106148

Highlights

• Improves the running time of the population count from $O (\log \log w)$ to $O (\log^{*} w)$.
• $O (\log^{*} w)$-time rank and select on an arbitrary word of w bits using $O (1)$ space.
• $O (1)$-time rank and select on bit-vectors not longer than $w / (\lg w - \lg \lg w + 1)$.

Abstract

Given a bit-vector t, the operation ${rank}_{1} (t, i)$ returns the number of 1-bits in the prefix of t ending at position i. The operation ${select}_{1} (t, c)$, $c \geq 1$, returns the position of the c-th 1-bit in t. These operations are building blocks of succinct data structures. We present rank and select algorithms for an arbitrary word that run in $O (\log^{*} w)$ time and use constant space, relying only on basic operations on words, where w is the word size in bits of the word RAM model. The method improves on the previously known $O (\log \log w)$ method for counting the number of 1-bits in a word. We also give rank and select algorithms that take constant time and space for bit-vectors of length at most $w / (⌈ \lg w - \lg \lg w ⌉ + 1)$.

Introduction

The word sizes in modern CPUs are 32-bit, 64-bit, and 128-bit. CPU instructions access and process bits in a word in parallel. Much research has focused on efficient bit-wise operations on words and their applications [1], [2], [3], [5], [7], [14], including the position of the least/most significant bit, suffix and prefix sums, and bit-permutations. Knuth [9] gave a comprehensive introduction to bit-wise tricks and techniques.

The rank and select operations are fundamental to succinct data structures. Given a string t of length n over an alphabet Σ, ${rank}_{a} (t, i)$ ( $i \geq 0, a \in Σ$ ) returns the number of occurrences of a in the prefix of t ending at position i; ${select}_{a} (t, c)$ ( $c \geq 1$ ) returns the position of the c-th occurrence of a in t. These problems were first studied by Jacobson [8], who presented a data structure that takes $o (n)$ bits and supports rank queries on binary sequences in constant time. For select queries, Clark and Munro [4] gave a data structure that answers a query in constant time on a RAM with word size $\log n$, using $o (n)$ bits of space. Pagh [10] and Raman et al. [11] presented compressed representations of bit-vectors that support these queries in $O (1)$ time.
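To make the two definitions concrete for the binary case, here is a minimal reference sketch. It is a plain loop-based baseline, not the word-parallel method of this paper. It assumes the bit-vector is stored in a non-negative Python integer with position 0 at the least significant bit; the function names and the convention of returning -1 when fewer than c ones exist are choices made for this illustration.

```python
def rank1(t: int, i: int) -> int:
    """Number of 1-bits in t[0..i], where t[0] is the least significant bit."""
    if i < 0:
        return 0
    prefix = t & ((1 << (i + 1)) - 1)  # keep only bits 0..i
    return bin(prefix).count("1")


def select1(t: int, c: int) -> int:
    """Position of the c-th 1-bit of t (c >= 1); -1 if t has fewer than c ones.

    The -1 convention is an assumption of this sketch.
    """
    pos = 0
    while t:
        if t & 1:
            c -= 1
            if c == 0:
                return pos
        t >>= 1
        pos += 1
    return -1


t = 0b10110  # t[1] = t[2] = t[4] = 1
print(rank1(t, 2))    # 2
print(select1(t, 3))  # 4
```

The two operations are inverse in the sense that `rank1(t, select1(t, c)) == c` whenever the c-th 1-bit exists.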

In this paper, we consider rank and select on a bit-vector packed in a word, without any pre-processing. These word-level operations are used in the construction of succinct data structures, such as the rank and select structures for long strings. When the input is packed into words, such algorithms benefit from the approach in this paper.

Previous results. Counting the number of 1-bits in a word (also called the population count) has been studied extensively. Wegner [12] introduced an efficient $O (w)$-time method that repeatedly applies $x = x \,\&\, (x - 1)$ to set the rightmost 1 of word x to 0, where w is the machine word size in bits. Knuth [9] attributes the first $O (\log w)$-time method to Wheeler et al. [13]. The method computes a word y from the input word x, where each field of y contains the number of 1's in the corresponding field of x. By doubling the field length of y at each step, it achieves $O (\log w)$ time. The HAKMEM memo [1] gave an $O (\log \log w)$-time method. It is the same as the $O (\log w)$-time algorithm, except that once the field length l exceeds $\lg w$, it adds all the fields of y by computing y modulo $2^{l} - 1$. Recently, both AMD and Intel introduced an instruction, called POPCNT, that counts the number of 1's in a word, but they provide no direct support for select.
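The three classical approaches can be sketched as follows. This is an illustrative sketch fixed to $w = 64$, with inputs assumed to satisfy $0 \leq x < 2^{64}$; the function names are hypothetical and the masks are the usual constants for this word size, not taken from the cited works.

```python
def popcount_wegner(x: int) -> int:
    """Clear the rightmost 1-bit until x is zero: one iteration per 1-bit."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def field_counts_8(x: int) -> int:
    """Word y whose 8-bit fields hold the number of 1s in the matching byte of x."""
    y = (x & 0x5555555555555555) + ((x >> 1) & 0x5555555555555555)  # 2-bit fields
    y = (y & 0x3333333333333333) + ((y >> 2) & 0x3333333333333333)  # 4-bit fields
    y = (y + (y >> 4)) & 0x0F0F0F0F0F0F0F0F                         # 8-bit fields
    return y


def popcount_doubling(x: int) -> int:
    """Keep doubling the field length until one field covers the whole word."""
    y = field_counts_8(x)
    y = (y + (y >> 8)) & 0x00FF00FF00FF00FF    # 16-bit fields
    y = (y + (y >> 16)) & 0x0000FFFF0000FFFF   # 32-bit fields
    y = (y + (y >> 32)) & 0x00000000FFFFFFFF   # 64-bit field
    return y


def popcount_mod(x: int) -> int:
    """Stop doubling at l = 8 > lg 64 and add all fields with one modulo by 2^l - 1."""
    return field_counts_8(x) % 255


x = 0xFF00FF00FF00FF00
print(popcount_wegner(x), popcount_doubling(x), popcount_mod(x))  # 32 32 32
```

The modulo step works because $2^{8} \equiv 1 \pmod{255}$, so y modulo 255 equals the sum of its bytes as long as that sum is below 255, which holds here since the sum is at most 64.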

Our results. We consider the Random Access Machine (RAM) with word size w and instructions on words, including arithmetic operations and bit-wise Boolean operations. We present rank and select algorithms for an arbitrary word that both run in $O (\log^{*} w)$ time and use constant space, where $\log^{*} w$ is the number of times the logarithm is applied repeatedly before the result is less than 1. We also introduce a method to diffuse $n < w$ bits over a word. Based on this method, we present constant time and space algorithms for rank and select on bit-vectors of length at most $w / (⌈ \lg w - \lg \lg w ⌉ + 1)$.
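To give a feel for how slowly $\log^{*} w$ grows, the following sketch evaluates it under the definition stated above (count applications of the logarithm until the value drops below 1). The base-2 logarithm and the function name `log_star` are assumptions of this illustration.

```python
import math


def log_star(w: float) -> int:
    """Number of times log2 is applied before the value is less than 1."""
    count = 0
    while w >= 1:
        w = math.log2(w)
        count += 1
    return count


for w in (32, 64, 128, 65536):
    print(w, log_star(w), math.log2(math.log2(w)))
```

For the word sizes 32, 64, and 128 the function returns 4 in each case, whereas $\log \log w$ keeps growing with w.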

Section snippets

Notions and basic operations

For a bit-vector $t = t [0] t [1] \dots t [n - 1]$ of n bits, denote the substring $t [i] \dots t [j]$ ( $i \leq j$ ) by $t [i, j]$ . Taken as an integer, the value of t is $\sum_{i = 0}^{| t | - 1} t [i] 2^{i}$ . For convenience, when the context is clear we use t to denote the value of t. An l-bit ( $l > 0$ ) field of t, say $t [i l, (i + 1) l - 1] (i \geq 0)$ , is denoted by $t {〈 i 〉}_{l}$ . We write bit-vectors in big-endian order, that is, with the higher bits on the left. For example, a bit-vector t of two bits, where $t [0] = 1, t [1] = 0$ , is written as 01. For an integer $c \geq 0$ , $t^{c}$ denotes the

Computing a block sum of a word

We use multiplication to count the number of 1's in a word. Assume that the number of 1's in x is not greater than $2^{l} - 1$ , that is, the number fits into l bits. If w is a multiple of l, then after $y \leftarrow x^{[l]} \times {(0^{l - 1} 1)}^{w / l}$ , the most significant l bits of word y equal the sum of all $x^{[l]} {〈 i 〉}_{l}$ , that is, the number of 1's in x.
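The multiplication step can be sketched as follows for $w = 64$ and $l = 8$. The sketch assumes that $x^{[l]}$ is a word whose l-bit fields already hold per-field counts (its definition is cut off in this excerpt), and that the product is taken modulo $2^{w}$ as on a word RAM; the helper names are hypothetical.

```python
W = 64  # word size assumed for this sketch
L = 8   # field length assumed for this sketch


def ones_pattern(l: int, w: int) -> int:
    """The constant (0^{l-1} 1)^{w/l}: a 1 in the lowest bit of every l-bit field."""
    assert w % l == 0
    return sum(1 << (k * l) for k in range(w // l))


def block_sum(fields: int, l: int = L, w: int = W) -> int:
    """Sum of all l-bit fields of `fields`, read from the top field of the product."""
    y = (fields * ones_pattern(l, w)) & ((1 << w) - 1)  # multiplication modulo 2^w
    return y >> (w - l)                                 # most significant l bits


fields = 0x0102030405060708  # eight byte-sized counts: 8, 7, ..., 1 from the low end
print(block_sum(fields))     # 36
```

As long as no field overflows, field j of the product holds the sum of fields 0 through j of the input, so the top field holds the total.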

For smaller l such that $2^{l} - 1 < w$ , that is, the number of 1's in x cannot fit into l bits, we use a multiplication to work out $x^{[l']}$ from $x^{[l]}$ , where $l'$ is a multiple of l and $l$

Rank

Algorithm. Using the idea of the last section, we give the rank algorithm as follows.

Analysis. The procedure computes block sums of x with increasing block sizes, say, $x^{[4]}, x^{[L (4)]}, x^{[L^{(2)} (4)]}, \dots$ . If $L (l)$ is greater than $i + 1$ , line 6 sets the block size to a multiple of l that is the minimum not less than $i + 1$ , and computes the block sum. Each iteration uses a constant number of operations, and the number of iterations of the while loop is $L^{*} (i + 1, 4)$ . The running time of rank is $O (L^{*} (w, 4))$

Select

We present an algorithm to compute ${select}_{1} (x, c)$ . The algorithm uses a prefix sum of x of block size g, where $⌈ \lg (w + 1) ⌉ \leq g \leq ⌊ \sqrt{w} ⌋$ . The selection of g and the computation of $x^{[g]}$ are presented in Subsection 5.1. We assume that the most significant $(w \mod g)$ bits of the input word are 0's.
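For a concrete feel of the constraint on g, the sketch below lists the admissible block sizes for a given w and then runs a block-based select that first locates the block holding the c-th 1-bit and then the bit inside that block. It uses plain loops rather than the word-parallel operations of the paper; the within-block search, the helper names, and the -1 return value are assumptions of this illustration.

```python
import math


def block_size_range(w: int) -> range:
    """All g with ceil(lg(w + 1)) <= g <= floor(sqrt(w))."""
    lo = w.bit_length()  # equals ceil(lg(w + 1)) for w >= 1
    hi = math.isqrt(w)
    return range(lo, hi + 1)


def select1_blocks(x: int, c: int, g: int, w: int = 64) -> int:
    """Position of the c-th 1-bit of x (c >= 1), searching block by block."""
    seen = 0                       # number of 1-bits in the blocks before block j
    block_mask = (1 << g) - 1
    for j in range(-(-w // g)):    # ceil(w / g) blocks
        block = (x >> (j * g)) & block_mask
        ones = bin(block).count("1")
        if seen + ones >= c:       # first block whose prefix count reaches c
            need = c - seen        # which 1-bit inside this block
            for k in range(g):
                if (block >> k) & 1:
                    need -= 1
                    if need == 0:
                        return j * g + k
        seen += ones
    return -1                      # fewer than c ones (convention of this sketch)


print(list(block_size_range(64)))       # [7, 8]
print(select1_blocks(0b10110, 3, g=7))  # 4
```

For $w = 64$ the constraint leaves only $g = 7$ or $g = 8$; with $g = 7$ the word does not divide evenly into blocks, which is where the assumption about the most significant $(w \bmod g)$ bits comes in.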

The select procedure operates in two stages. In the first stage, it finds the index of the block where the c-th 1-bit resides, which is the minimum integer j such that ${rank}_{1} (x, (j + 1) g - 1) \geq c$ . In the second stage, it

$O (\log^{*} w)$-time rank

We give a variation of the rank algorithm that runs in $O (\log^{*} w)$ time. We set the initial block size $ℓ_{1}$ of rank to 2 and the block size in the i-th ( $i \geq 2$ ) iteration $ℓ_{i}$ to $2^{ℓ_{i - 1}}$ . The maximum number that can be represented by $ℓ_{i}$ bits is $2^{ℓ_{i}} - 1 = ℓ_{i + 1} - 1 < ℓ_{i + 1}$ . So $ℓ_{i}$ bits cannot hold the number of 1's in an $ℓ_{i + 1}$ -bit block. While this revised rank runs, sums in blocks may therefore overflow into the adjacent blocks on the left. However, the revised rank works correctly on $x \,\&\, {(01)}^{w / 2}$ . In $x \,\&\, {(01)}^{w / 2}$ , the number of 1's in each ${}^{i}2$

Constant-time rank and select for short bit-vectors

For short bit-vectors, the number of 1's can be computed by a constant number of basic operations. Let v be an n-bit bit-vector. Let l be the block size such that $n l \leq w$ . Assume that we can diffuse bits in v to a word z such that the LSB of each block $z {〈 i 〉}_{l}$ equals a unique bit in v. If l is big enough, for example $l = ⌈ \lg (n + 1) ⌉$ , then one BS operation can compute the number of 1's in z.
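The counting idea can be sketched as follows, assuming $w = 64$ and $l = ⌈ \lg (n + 1) ⌉$. The diffusion here is a simple loop that stands in for the constant-time Diffuse procedure, which is not reproduced in this excerpt. The total is read from field $n - 1$ of the product, a small variation that avoids requiring w to be a multiple of l; all names are hypothetical.

```python
def diffuse(v: int, n: int, l: int) -> int:
    """Place bit i of v at the least significant bit of the i-th l-bit block.

    Loop-based stand-in for a constant-time diffusion.
    """
    z = 0
    for i in range(n):
        z |= ((v >> i) & 1) << (i * l)
    return z


def count_ones_short(v: int, n: int, w: int = 64) -> int:
    """Number of 1s in the n-bit vector v, using one multiplication after diffusion."""
    l = n.bit_length()                          # equals ceil(lg(n + 1)) for n >= 1
    assert n * l <= w, "bit-vector too long for this word size"
    z = diffuse(v, n, l)
    ones = sum(1 << (k * l) for k in range(n))  # the constant (0^{l-1} 1)^n
    y = z * ones                                # field j holds z<0> + ... + z<j>
    return (y >> ((n - 1) * l)) & ((1 << l) - 1)


print(count_ones_short(0b10110101, 8))  # 5
```

Since every partial sum is at most $n \leq 2^{l} - 1$, no field of the product carries into its neighbour, so field $n - 1$ holds the exact count.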

In this section, we first present a diffuse algorithm. The algorithm of Diffuse was first given by Fredman and

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported by the Commission of Development and Reform of Jilin Province No. 2019C053-10 and the Education Department of Jilin Province No. JJKH20190162KJ.

References (14)

Oren Ben-Kiki et al. Towards optimal packed string matching. Theor. Comput. Sci. (2014)
Michael L. Fredman et al. Surpassing the information theoretic bound with fusion trees. J. Comput. Syst. Sci. (1993)
Kimmo Fredriksson et al. Exploiting word-level parallelism for fast convolutions and their applications in approximate string matching. Eur. J. Comb. (2013)
R.W. Gosper et al.
Andrej Brodnik et al. Trans-dichotomous algorithms without multiplication - some upper and lower bounds
D.R. Clark. Compact PAT Trees (1996)
F.E. Fich. Constant time operations for words of length w (1999)

There are more references available in the full text version of this article.
