Succinct data structures for nearest colored node in a tree

doi:10.1016/j.ipl.2017.10.001

Information Processing Letters

Volume 132, April 2018, Pages 6-10

https://doi.org/10.1016/j.ipl.2017.10.001

Highlights

• We give a data structure that stores a tree with colors on the nodes. Given a node x and a color α, the structure finds the nearest node to x with color α.
• The structure is succinct, namely its space complexity is close to the optimal space required to store the tree.
• This structure improves on the $O (n \log n)$-bit structure of Gawrychowski et al. [12].

Abstract

We give succinct data structures that store a tree with colors on the nodes. Given a node x and a color α, the structures find the nearest node to x with color α. Our results improve on the $O (n \log n)$-bit structure of Gawrychowski et al. (2016) [12].

Introduction

In the nearest colored node problem the goal is to store a tree with colors on the nodes such that given a node x and a color α, the nearest node to x with color α can be found efficiently. Gawrychowski et al. [12] gave a data structure for this problem that uses $O (n \log n)$ bits and answers queries in $O (\log \log n)$ time, where n is the number of nodes in the tree.
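To make the query concrete, the following is a naive baseline, not the structure of [12] or of this paper: a breadth-first search from x that stops at the first node with color α. It assumes unweighted edges, and the adjacency map `adj`, the map `color`, and the example tree are hypothetical. It takes $O (n)$ time per query, which is the cost the data structures discussed here are designed to avoid.

```python
from collections import deque

def nearest_colored_node(adj, color, x, alpha):
    """Naive baseline: BFS from x, return the first node found with color alpha.

    adj   : dict mapping each node to a list of its neighbours (tree edges)
    color : dict mapping each node to its color
    Returns None if no node has color alpha. Edges are assumed unweighted.
    """
    seen = {x}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        if color[v] == alpha:
            return v
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return None

# Hypothetical example tree with edges 0-1, 0-2, 1-3, 1-4, 2-5.
adj = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5], 3: [1], 4: [1], 5: [2]}
color = {0: 1, 1: 2, 2: 1, 3: 3, 4: 2, 5: 3}
nearest = nearest_colored_node(adj, color, 4, 3)
```

In this example the nodes with color 3 are 3 and 5, at distances 2 and 4 from node 4, so the query for node 4 and color 3 returns node 3.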

In this paper we give succinct structures for the nearest colored node problem. Our results are given in the following theorem.

Theorem 1

Let T be a colored tree with n nodes and colors from $[1, σ]$, and let $P_{T}$ be a string containing the colors of the nodes in preorder.

1. For $σ = o (\log n / {(\log \log n)}^{2})$, for any $k = o (\log n / \log^{2} σ)$, there is a representation of T that uses $n H_{k} (P_{T}) + 2 n + o (n)$ bits and answers nearest colored node queries in $O (1)$ time, where $H_{k} (P_{T})$ is the k-th order entropy of $P_{T}$.
2. For $σ = w^{O (1)}$ (where w is the word size), for any function $f (n) = ω (1)$, there is a representation of T that uses $n H_{0} (P_{T}) + 2 n + o (n)$ bits and answers nearest colored node queries in $O (f (n))$ time.
3. For $σ \leq n$, there is a representation of T that uses $n H_{0} (P_{T}) + 2 n + o (n H_{0} (P_{T})) + o (n)$ bits and answers nearest colored node queries in $O (\log \frac{\log σ}{\log w})$ time.
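As an illustration of the leading terms in these space bounds, the sketch below builds $P_{T}$ for a small tree and evaluates its empirical entropy. It assumes the standard definitions of empirical entropy (zeroth order from symbol frequencies, k-th order as the weighted zeroth-order entropy of the symbols that follow each length-k context). The function names and the example tree are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def preorder_colors(children, color, root):
    """Return P_T: the colors of the nodes listed in preorder."""
    out, stack = [], [root]
    while stack:
        v = stack.pop()
        out.append(color[v])
        # Push children right-to-left so the leftmost child is visited first.
        stack.extend(reversed(children[v]))
    return out

def h0(s):
    """Zeroth-order empirical entropy of s, in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    return sum(c / n * math.log2(n / c) for c in Counter(s).values())

def hk(s, k):
    """k-th order empirical entropy of s (assumed standard definition)."""
    if k == 0:
        return h0(s)
    followers = defaultdict(list)
    for i in range(k, len(s)):
        followers[tuple(s[i - k:i])].append(s[i])
    return sum(len(f) * h0(f) for f in followers.values()) / len(s)

# Hypothetical ordinal tree: node 0 is the root.
children = {0: [1, 2], 1: [3, 4], 2: [5], 3: [], 4: [], 5: []}
color = {0: 1, 1: 2, 2: 1, 3: 3, 4: 2, 5: 3}

p_t = preorder_colors(children, color, 0)
n = len(p_t)
leading_bits = n * h0(p_t) + 2 * n  # n*H_0(P_T) + 2n, ignoring lower-order terms
```

Here $P_{T}$ is the sequence 1, 2, 3, 2, 1, 3, in which each of the three colors appears twice, so $H_{0} (P_{T}) = \log_2 3 \approx 1.585$ and $n H_{0} (P_{T}) + 2 n \approx 21.5$ bits. The lower-order terms are not meaningful at this size; the example only shows how the leading terms are formed.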

Theorem 1 improves both the space complexity and the query time complexity of the structure of Gawrychowski et al. [12].

Gawrychowski et al. [12] also considered a dynamic version of the nearest colored node problem in which the colors of the nodes can be changed. For this problem they gave an $O (n \log n)$-bit structure that supports updates and queries in $O (\log n)$ time. They also gave a structure with $O (n \log^{2 + ϵ} n)$ space, optimal $O (\log n / \log \log n)$ query time, and $O (\log^{1 + ϵ} n)$ update time.

Several papers studied data structures for storing colored trees with support for various queries [4], [6], [9], [13], [14], [19], [20]. In particular, the problem of finding the nearest ancestor with color α was considered in [6], [13], [14], [19], [20]. In order to solve the nearest colored node problem, we combine techniques from the papers above and from Gawrychowski et al. [12].

Another related problem is to find an approximate nearest node with color α. This problem has been studied in general graphs [7], [15], [16] and planar graphs [1], [17], [18].

Section snippets

Preliminaries

Throughout the paper we assume the tree T is an ordinal tree (for a non-ordinal tree an ordering can be chosen arbitrarily). When we write that a node w is a descendant of v it means that either $w = v$ or w is a proper descendant of v. The same holds for other tree terminology, e.g. ancestor.

A node with color α is called an α-node. We also use other α-terms with the appropriate meaning, e.g. an α-descendant of a node v is a descendant of v with color α.
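The conventions above can be stated as a small sketch. It assumes a hypothetical parent map in which the root maps to `None`; the function names are illustrative only. Note that a node counts as its own descendant, so an α-node is an α-descendant of itself.

```python
def is_descendant(parent, w, v):
    """True if w is a descendant of v; every node is a descendant of itself."""
    while w is not None:
        if w == v:
            return True
        w = parent[w]
    return False

def alpha_descendants(parent, color, v, alpha):
    """All descendants of v (including v itself) that have color alpha."""
    return [w for w in parent
            if color[w] == alpha and is_descendant(parent, w, v)]

# Hypothetical tree: node 0 is the root.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
color = {0: 1, 1: 2, 2: 1, 3: 3, 4: 2, 5: 3}
two_descendants_of_1 = alpha_descendants(parent, color, 1, 2)
```

In this example node 1 has color 2, so it appears among its own 2-descendants together with node 4.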

Proof of part 1 of Theorem 1

Our structure is similar to the labeled tree structure of He et al. [14]. As in [14], the data structure stores $P_{T}$ in the compressed structure of Ferragina and Venturini [10], the tree T without the colors in the data structure of Farzan and Munro [8], and additional structures described below. Recall that the structure of T keeps the balanced parenthesis string of T.
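For readers unfamiliar with the balanced parenthesis string, the sketch below shows how it is obtained from an ordinal tree: an opening parenthesis is written when a node is first visited in a depth-first traversal and a closing parenthesis when its subtree is finished. This is only the textbook encoding, with hypothetical names; it does not implement the structure of [8] or any of its query support.

```python
def balanced_parentheses(children, root):
    """Return the balanced parenthesis string of an ordinal tree.

    children maps each node to the ordered list of its children.
    """
    out = []
    stack = [(root, False)]
    while stack:
        v, closing = stack.pop()
        if closing:
            out.append(")")
        else:
            out.append("(")
            stack.append((v, True))
            # Push children right-to-left so they are visited left-to-right.
            for c in reversed(children[v]):
                stack.append((c, False))
    return "".join(out)

# Hypothetical ordinal tree: node 0 is the root.
children = {0: [1, 2], 1: [3, 4], 2: [5], 3: [], 4: [], 5: []}
bp = balanced_parentheses(children, 0)
```

The string has exactly two symbols per node, so it occupies $2 n$ bits when each parenthesis is stored as one bit. The i-th opening parenthesis corresponds to the i-th node in preorder, which is the same order in which $P_{T}$ lists the colors.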

Using the tree decomposition of Lemma 2, the tree T is partitioned into mini-trees of size at most $L' = \lceil \log^{2} n \rceil$, and every

Proof of parts 2 and 3 of Theorem 1

Our data structure stores the rank-select structure of Belazzougui and Navarro [5] on $P_{T}$ , the tree structure of Farzan and Munro [8] on the tree T without the colors, and additional information that will be described below.

Our structure is similar to the structure of Gawrychowski et al. [12]. We next give a short description of the structure of [12]. For a color α, let $Z_{α}$ be the set of all α-nodes and their ancestors, and let $Y_{α}$ be the set of all nodes $x \in Z_{α}$ such that either x has color α, or x
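The set $Z_{α}$ defined above can be computed directly from its definition, as in the following sketch. It assumes a hypothetical parent map in which the root maps to `None`. Only $Z_{α}$ is sketched, since the definition of $Y_{α}$ is cut off in this snippet.

```python
def z_alpha(parent, color, alpha):
    """Return Z_alpha: all nodes with color alpha together with their ancestors."""
    z = set()
    for v in parent:
        if color[v] == alpha:
            # Walk up towards the root. Once a node already in z is reached,
            # all of its ancestors were added on an earlier walk.
            while v is not None and v not in z:
                z.add(v)
                v = parent[v]
    return z

# Hypothetical tree: node 0 is the root.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
color = {0: 1, 1: 2, 2: 1, 3: 3, 4: 2, 5: 3}
z3 = z_alpha(parent, color, 3)
```

Because each walk stops at the first node already in the set, the total work for one color is proportional to the number of nodes of that color plus $|Z_{α}|$.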

References (20)

S. Alstrup et al., Optimal on-line decremental connectivity in trees, Inf. Process. Lett. (1997)
J. Barbay et al., Adaptive searching in succinctly encoded binary relations and tree-structured documents, Theor. Comput. Sci. (2007)
P. Ferragina et al., A simple storage scheme for strings achieving entropy bounds, Theor. Comput. Sci. (2007)
D. Tsur, Succinct representation of labeled trees, Theor. Comput. Sci. (2015)
I. Abraham et al., Approximate nearest neighbor search in metrics of planar graphs
S. Alstrup et al., Minimizing diameters of dynamic trees
D. Belazzougui et al., Optimal lower and upper bounds for representing sequences, ACM Trans. Algorithms (2015)
P. Bille et al., Compressed subsequence matching and packed tree coloring, Algorithmica (2017)
S. Chechik, Improved distance oracles and spanners for vertex-labeled graphs
A. Farzan et al., A uniform paradigm to succinctly encode various families of trees, Algorithmica (2014)

There are more references available in the full text version of this article.

