# Synchronizing Data Words for Register Automata 

Karin Quaas<br>Universität Leipzig

Mahsa Shirmohammadi<br>CNRS \& IRIF


#### Abstract

Register automata (RAs) are finite automata extended with a finite set of registers to store and compare data from an infinite domain. We study the concept of synchronizing data words in RAs: does there exist a data word that sends all states of the RA to a single state?

For deterministic RAs with $k$ registers ($k$-DRAs), we prove that inputting data words with $2k+1$ distinct data from the infinite data domain is sufficient to synchronize. We show that the synchronization problem for DRAs is in general PSPACE-complete, and it is NLOGSPACE-complete for 1-DRAs. For nondeterministic RAs (NRAs), we show that $\operatorname{Ackermann}(n)$ distinct data (where $n$ is the size of the RA) might be necessary to synchronize. The synchronization problem for NRAs is in general undecidable; however, we establish Ackermann-completeness of the problem for 1-NRAs. Another main result is the NEXPTIME-completeness of the length-bounded synchronization problem for NRAs, where a bound on the length of the synchronizing data word, written in binary, is given. A variant of this last construction allows us to prove that the length-bounded universality problem for NRAs is co-NEXPTIME-complete.


## 1 Introduction

Given a deterministic finite automaton (DFA), a synchronizing word is a word that sends all states of the automaton to a unique state. Synchronizing words for finite automata have been studied since the 1970s [8, 25, 30, 23] and are the subject of one of the most well-known open problems in automata theory: the Černý conjecture. This conjecture states that the length of a shortest synchronizing word for a DFA with $n$ states is at most $(n-1)^{2}$. Synchronizing words moreover have applications in planning, control of discrete event systems, biocomputing, and robotics [3, 30, 15]. More recently the notion has been generalized from automata to games [21, 28, 20] and infinite-state systems [14, 9], with applications to modelling complex systems such as distributed data networks or real-time embedded systems.

In this paper we are interested in synchronizing data words for register automata. Data words are sequences of pairs, where the first element of each pair is taken from a finite alphabet and the second element is taken from an infinite data domain, such as the natural numbers or ASCII strings. Data words have applications in querying and reasoning about data models with complex structural properties, e.g., XML and graph databases [1, 16, 5, 2]. For reasoning about data words, various formalisms have been considered, including first-order logic for data words [4, 6], extensions of linear temporal logic [22, 12, 11, 13], data automata [7, 4], register automata [19, 26, 24, 11] and extensions thereof, e.g. [29, 17, 10].

Register automata (RAs) are a generalization of finite automata for processing data words. RAs are equipped with a finite set of registers that can store data values. While processing a data word such an automaton can store the datum at the current position in one of its registers; it can also test the current datum for equality with data already stored in its registers. In applications, RAs allow for handling parameters such as user names, passwords, identifiers of connections, sessions, etc. RAs come in many variants, including one-way, two-way, deterministic, nondeterministic, and alternating. For alternating one-way RAs, classical language-theoretic decision problems, such as emptiness, universality and inclusion are undecidable. In this paper, we focus on the class of one-way nondeterministic RAs, which have a decidable emptiness problem [19], and the subclass of nondeterministic RAs with a single register, which has a decidable universality problem [11].

Semantically, an RA defines an infinite-state transition system due to the unbounded domain for the data stored in the registers. Synchronizing words were introduced for infinite-state systems with infinite branching in [14, 28]; in particular, the notion of synchronizing words is motivated and studied for weighted automata and timed automata. In some infinite-state settings, such as nested-word automata, finding the right definition of synchronizing word is however more challenging [9]. We define the synchronization problem for RAs within the framework suggested in [14, 28]: given an RA $\mathcal{R}$ over a finite alphabet $\Sigma$ and an infinite data domain $D$, does there exist a data word $w \in(\Sigma \times D)^{+}$ and some state $q_{w}$ such that the word $w$ sends each of the infinitely many states of $\mathcal{R}$ to $q_{w}$? Note that the state $q_{w}$ depends on the word $w$; we call such a data word a synchronizing data word.

Contribution. The problem of finding synchronizing data words for RAs poses new challenges in the area of synchronization. It is natural to ask how many distinct data are necessary and sufficient to synchronize an RA, which we refer to as the data efficiency of synchronizing data words. We show that the data efficiency is polynomial in the number of registers for deterministic RAs (DRAs). For nondeterministic RAs (NRAs), we provide an example that shows that the data efficiency may be $\operatorname{Ackermann}(n)$, where $n$ is the number of states of the NRA. Remarkably, the data efficiency is tightly related to the complexity of deciding the synchronization problem. For DRAs, we prove that for all automata $\mathcal{R}$ with $k$ registers, if $\mathcal{R}$ has a synchronizing data word, then it also has one with data efficiency at most $2k+1$. We provide a family $\left(\mathcal{R}_{k}\right)_{k \in \mathbb{N}}$ of DRAs with $k$ registers, for which indeed a polynomial data efficiency (in $k$) is necessary to synchronize. This bound is the base of an (N)PSPACE-algorithm for DRAs; we prove a matching PSPACE lower bound by ideas carried over from timed settings [14]. We show that the synchronization problems for DRAs with a single register (1-DRAs) and for DFAs are NLOGSPACE-interreducible, implying that the problem is NLOGSPACE-complete for 1-DRAs.

For NRAs, a reduction from the non-universality problem yields the undecidability of the synchronization problem. For single-register NRAs (1-NRAs), we prove Ackermann-completeness of the problem by a novel construction proving that the synchronization problem and the non-universality problem for 1-NRAs are polynomial-time interreducible. We believe that this technique is useful in studying synchronization in all nondeterministic settings, requiring careful analysis of the size of the construction.

Another main contribution is to prove NEXPTIME-completeness of the length-bounded synchronization problem for NRAs: given a bound on the length (written in binary), does there exist a synchronizing data word with length at most the given bound? For the lower bound, we present a reduction from the membership problem of $\mathcal{O}\left(2^{n}\right)$-time bounded nondeterministic Turing machines. The crucial ingredient in this reduction is a family of RAs implementing binary counters. A variant of our construction yields a proof for co-NEXPTIME-completeness of the length-bounded universality problem for NRAs; the length-bounded universality problem asks whether all data words of length at most a given bound (written in binary) are in the language of the automaton. We further make a connection to the emptiness problem of single-register alternating RAs.

An extended abstract of this article has appeared in the Proceedings of the 41st International Symposium on Mathematical Foundations of Computer Science (MFCS) 2016 [?]. In comparison with the extended abstract, here we simplify two of the main constructions and add detailed proofs of all results. The main improvement is a simpler NEXPTIME-hardness reduction for the length-bounded synchronization problem for NRAs.

## 2 Preliminaries

A deterministic finite-state automaton (DFA) is a tuple $\mathcal{A}=\langle Q, \Sigma, \Delta\rangle$, where $Q$ is a finite set of states, $\Sigma$ is a finite alphabet, and $\Delta: Q \times \Sigma \rightarrow Q$ is a transition function that is totally defined. The function $\Delta$ extends to finite words in a natural way: $\Delta(q, w a)=\Delta(\Delta(q, w), a)$ for all words $w \in \Sigma^{*}$ and letters $a \in \Sigma$; it extends to all sets $S \subseteq Q$ by $\Delta(S, w)=\bigcup_{q \in S} \Delta(q, w)$.

Data words and register automata. For the rest of this paper, fix an infinite data domain $D$. Given a finite alphabet $\Sigma$, a data word over $\Sigma$ is a finite word over $\Sigma \times D$. For a data word $w=\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{n}, d_{n}\right)$, the length $|w|$ of $w$ is $n$. We use $\operatorname{data}(w)=\left\{d_{1}, \ldots, d_{n}\right\} \subseteq D$ to refer to the set of data values occurring in $w$, and we define the data efficiency of $w$ to be $|\operatorname{data}(w)|$.

Let $R$ be a finite set of register variables. We define register constraints $\phi$ over $R$ by the grammar

$$
\phi ::= \text{true} \mid {=}r \mid \phi \wedge \phi \mid \neg \phi,
$$

where $r \in R$. We denote by $\Phi(R)$ the set of all register constraints over $R$. We may use $\neq r$ for the inequality constraint $\neg(=r)$; disjunction $\vee$ is derived from $\wedge$ and $\neg$ in the usual way. A register valuation is a mapping $\nu: R \rightarrow D$ that assigns a data value to each register; we sometimes write $\nu=\left(\nu\left(r_{1}\right), \cdots, \nu\left(r_{k}\right)\right) \in D^{k}$, where $R=\left\{r_{1}, \cdots, r_{k}\right\}$. The satisfaction relation of register constraints is defined on $D^{k} \times D$ as follows: $(\nu, d)$ satisfies the constraint $=r$ if $\nu(r)=d$; the other cases follow. For example, $\left(\left(d_{1}, d_{2}, d_{1}\right), d_{2}\right)$ satisfies $\left(\left(=r_{1}\right) \wedge\left(=r_{2}\right)\right) \vee\left(\neq r_{3}\right)$ if $d_{1} \neq d_{2}$. For a set $\text{up} \subseteq R$ and $d \in D$, we define the update $\nu[\text{up}:=d]$ of valuation $\nu$ by $(\nu[\text{up}:=d])(r)=d$ if $r \in \text{up}$, and $(\nu[\text{up}:=d])(r)=\nu(r)$ otherwise.
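As an illustration of the definitions above, the following is a minimal Python sketch of register constraints, the satisfaction relation, and the update $\nu[\text{up}:=d]$. The class names (`TrueC`, `Eq`, `And`, `Not`), the helper `Or`, and the dictionary encoding of valuations are assumptions made for this example only; they are not notation from the paper.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical encoding of the grammar: true | =r | phi AND phi | NOT phi.
@dataclass(frozen=True)
class TrueC:
    pass


@dataclass(frozen=True)
class Eq:
    reg: str  # the constraint "=r"


@dataclass(frozen=True)
class And:
    left: "Constraint"
    right: "Constraint"


@dataclass(frozen=True)
class Not:
    arg: "Constraint"


Constraint = Union[TrueC, Eq, And, Not]


def Or(a, b):
    """Derived disjunction, expressed with And and Not (De Morgan)."""
    return Not(And(Not(a), Not(b)))


def satisfies(nu, d, phi):
    """Does (nu, d) satisfy phi? nu maps register names to data values."""
    if isinstance(phi, TrueC):
        return True
    if isinstance(phi, Eq):
        return nu[phi.reg] == d
    if isinstance(phi, And):
        return satisfies(nu, d, phi.left) and satisfies(nu, d, phi.right)
    if isinstance(phi, Not):
        return not satisfies(nu, d, phi.arg)
    raise TypeError(f"not a register constraint: {phi!r}")


def update(nu, up, d):
    """The valuation nu[up := d]: registers in up get d, others are unchanged."""
    return {r: (d if r in up else value) for r, value in nu.items()}


# The example from the text: ((d1, d2, d1), d2) with d1 != d2.
nu = {"r1": "d1", "r2": "d2", "r3": "d1"}
phi = Or(And(Eq("r1"), Eq("r2")), Not(Eq("r3")))
example_holds = satisfies(nu, "d2", phi)
```

Here `example_holds` evaluates the constraint from the example in the text on the valuation $(d_1, d_2, d_1)$ and the input datum $d_2$.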

A register automaton (RA) is a tuple $\mathcal{R}=\langle L, R, \Sigma, T\rangle$, where $L$ is a finite set of locations, $R$ is a finite set of registers, $\Sigma$ is a finite alphabet and $T \subseteq L \times \Sigma \times \Phi(R) \times 2^{R} \times L$ is a transition relation. We may use $\ell \xrightarrow{\phi\ a\ \text{up} \downarrow} \ell^{\prime}$ to denote transitions $\left(\ell, a, \phi, \text{up}, \ell^{\prime}\right) \in T$. We call $\ell \xrightarrow{\phi\ a\ \text{up} \downarrow} \ell^{\prime}$ an $a$-transition and $\phi$ the guard of this transition. A guard true is vacuously true and may be omitted. Likewise we may omit up if $\text{up}=\emptyset$. We may write $r \downarrow$ when $\text{up}=\{r\}$ is a singleton set. For NRAs with only one register, we may shortly write $=$ and $\neq$ for the guards $=r$ and $\neq r$, respectively, and $\downarrow$ for the update $r \downarrow$.

A configuration of $\mathcal{R}$ is a pair $(\ell, \nu) \in L \times D^{|R|}$ of a location $\ell$ and a register valuation $\nu$. We describe the behaviour of $\mathcal{R}$ as follows: Given a configuration $q=(\ell, \nu)$ and some input $(a, d) \in \Sigma \times D$ an $a$-transition $\ell \xrightarrow{\phi \quad a \text { up } \downarrow} \ell^{\prime}$ may be fired from $q$ if $(\nu, d)$ satisfies the constraint $\phi$; then $\mathcal{R}$ moves to the successor configuration $q^{\prime}=\left(\ell^{\prime}, \nu^{\prime}\right)$, where $\nu^{\prime}=\nu[$ up $:=d]$ is the update of $\nu$. By post $(q,(a, d))$, we denote the set of all successor configurations $q^{\prime}$ of $q$ on input $(a, d)$. We extend post to sets $S \subseteq L \times D^{|R|}$ of configurations by $\operatorname{post}(S,(a, d))=\bigcup_{q \in S} \operatorname{post}(q,(a, d))$; and we extend post to words by $\operatorname{post}(S, w \cdot(a, d))=$ $\operatorname{post}(\operatorname{post}(S, w),(a, d))$ for all words $w \in(\Sigma \times D)^{*}$, and all inputs $(a, d) \in \Sigma \times D$.
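The successor function can be sketched in Python as follows. This is only an illustration under stated assumptions: valuations are tuples indexed by register position, guards are plain Python predicates, and the toy one-register automaton `TOY` is hypothetical (it does not appear in the paper). Since $L \times D^{|R|}$ is infinite, the sketch computes successors of a finite set of configurations only.

```python
def post(transitions, configs, word):
    """Successors of a finite set of configurations on a data word.

    transitions: iterable of (source, letter, guard, up, target), where
      guard(nu, d) -> bool and up is a set of register indices.
    configs: iterable of (location, valuation) with valuation a tuple.
    word: sequence of (letter, datum) pairs.
    """
    current = set(configs)
    for (a, d) in word:
        successors = set()
        for (loc, nu) in current:
            for (src, letter, guard, up, dst) in transitions:
                if src == loc and letter == a and guard(nu, d):
                    # nu[up := d]: overwrite exactly the registers in up.
                    new_nu = tuple(d if i in up else v for i, v in enumerate(nu))
                    successors.add((dst, new_nu))
        current = successors
    return current


# Hypothetical toy RA with one register (index 0) and the single letter "a".
def eq(nu, d):
    return nu[0] == d


def neq(nu, d):
    return nu[0] != d


def true(nu, d):
    return True


TOY = [
    ("p", "a", neq, {0}, "q"),    # p --(!=, a, update)--> q
    ("p", "a", eq, set(), "p"),   # p --(=, a)--> p
    ("q", "a", true, {0}, "q"),   # q --(true, a, update)--> q
]

start = {("p", (5,)), ("p", (7,)), ("q", (9,))}
after_one = post(TOY, start, [("a", 7)])
after_two = post(TOY, start, [("a", 7), ("a", 8)])
```

On this toy automaton, one input leaves two configurations, and a second input with a fresh datum merges them into one.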

A run of $\mathcal{R}$ over the data word $w=\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{n}, d_{n}\right)$ is a sequence of configurations $q_{0} q_{1} \ldots q_{n}$, where $q_{i} \in \operatorname{post}\left(q_{i-1},\left(a_{i}, d_{i}\right)\right)$ for all $1 \leq i \leq n$. If $\mathcal{R}$ reaches a configuration $q=(\ell, x)$ during processing a word $w$, we may say that an $x$-token is in $\ell$ (or simply a token is in $\ell$ ).

In the rest of the paper, we consider complete RAs, meaning that for all configurations $q \in L \times D^{|R|}$ and all inputs $(a, d) \in \Sigma \times D$, there is at least one successor: $|\operatorname{post}(q,(a, d))| \geq 1$. We also classify the RAs into deterministic RAs (DRAs) and nondeterministic RAs (NRAs), where an RA is deterministic if $|\operatorname{post}(q,(a, d))| \leq 1$ for all configurations $q$ and all inputs $(a, d)$. A $k$-NRA ($k$-DRA, respectively) is an NRA (DRA, respectively) with $|R|=k$.

Synchronizing words and synchronizing data words. Synchronizing words are a well-studied concept for DFAs, see, e.g., [30]. Informally, a synchronizing word leads the automaton from every state to the same state. Formally, the word $w \in \Sigma^{+}$ is synchronizing for a DFA $\mathcal{A}=\langle Q, \Sigma, \Delta\rangle$ if there exists some state $q \in Q$ such that $\Delta(Q, w)=\{q\}$. The synchronization problem for DFAs asks, given a DFA $\mathcal{A}$, whether there exists some synchronizing word for $\mathcal{A}$.

The synchronization problem for DFAs is in NLOGSPACE by using the pairwise synchronization technique: given a DFA $\mathcal{A}=\langle Q, \Sigma, \Delta\rangle$, it is known that $\mathcal{A}$ has a synchronizing word if and only if for all pairs of states $q, q^{\prime} \in Q$, there exists a word $v$ such that $\Delta(q, v)=\Delta\left(q^{\prime}, v\right)$ (see [30] for more details). The pairwise synchronization algorithm initially sets $S_{|Q|}=Q$. For $i=|Q|-1, \cdots, 1$, the algorithm repeats the following two steps: (a) For two distinct states $q, q^{\prime} \in S_{i+1}$, find $v_{i}$ such that $\Delta\left(q, v_{i}\right)=\Delta\left(q^{\prime}, v_{i}\right)$. (b) Set $S_{i}=\Delta\left(S_{i+1}, v_{i}\right)$ (and repeat the loop). The word $w=v_{|Q|-1} \cdots v_{1}$ is synchronizing for $\mathcal{A}$.
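The pairwise technique can be sketched in Python as below. This is a plain deterministic search (breadth-first over pairs of states), not the NLOGSPACE procedure itself, and the function names and the dictionary encoding of $\Delta$ are assumptions of this example. The loop stops as soon as a single state remains, and states are assumed to be sortable so that a pair can be picked deterministically.

```python
from collections import deque


def merging_word(delta, alphabet, p, q):
    """Shortest word v with delta(p, v) == delta(q, v), or None if none exists."""
    start = (p, q)
    parent = {start: None}  # pair -> (previous pair, letter read)
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        if pair[0] == pair[1]:
            word = []
            while parent[pair] is not None:
                pair, a = parent[pair]
                word.append(a)
            return word[::-1]
        for a in alphabet:
            nxt = (delta[pair[0], a], delta[pair[1], a])
            if nxt not in parent:
                parent[nxt] = (pair, a)
                queue.append(nxt)
    return None


def apply_word(delta, states, word):
    """The image delta(S, w) of a set of states under a word."""
    for a in word:
        states = {delta[s, a] for s in states}
    return states


def synchronizing_word(states, alphabet, delta):
    """A synchronizing word built by repeatedly merging two states, or None.

    Returns the empty list for a one-state DFA; the text requires a
    non-empty word, so a caller may append any letter in that case.
    """
    current = set(states)
    word = []
    while len(current) > 1:
        p, q = sorted(current)[:2]  # step (a): pick two distinct states
        v = merging_word(delta, alphabet, p, q)
        if v is None:
            return None  # some pair cannot be merged: not synchronizable
        word.extend(v)
        current = apply_word(delta, current, v)  # step (b)
    return word


# Usage: the 4-state Cerny automaton (a is a cyclic shift, b sends 3 to 0).
states = [0, 1, 2, 3]
delta = {}
for s in states:
    delta[s, "a"] = (s + 1) % 4
    delta[s, "b"] = 0 if s == 3 else s
w = synchronizing_word(states, "ab", delta)
```

The word found this way is synchronizing but not necessarily a shortest one.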

We introduce synchronizing data words for RAs. Given an RA $\mathcal{R}=\langle L, R, \Sigma, T\rangle$, a data word $w \in(\Sigma \times D)^{+}$ is synchronizing for $\mathcal{R}$ if there exists some configuration $q_{w}=(\ell, \nu)$ such that $\operatorname{post}\left(L \times D^{|R|}, w\right)=\left\{q_{w}\right\}$. Intuitively, no matter what the starting location and register valuation are, by inputting the data word $w$, $\mathcal{R}$ will be in the unique successor configuration $q_{w}$. This configuration $q_{w}$ depends on $w$. The synchronization problem for RAs asks, given an RA $\mathcal{R}$ over a data domain $D$, whether there exists some synchronizing data word for $\mathcal{R}$. The length-bounded synchronization problem for RAs asks, given an RA $\mathcal{R}$ and a bound $N \in \mathbb{N}$ written in binary, whether there exists some synchronizing data word $w$ for $\mathcal{R}$ satisfying $|w| \leq N$.

## 3 Synchronizing data words for DRAs

In this section, we first show that the synchronization problems for 1-DRAs and DFAs are NLOGSPACE-interreducible, implying that the problem is NLOGSPACE-complete for 1-DRAs. Next, we prove that the problem for $k$-DRAs, in general, can be decided in PSPACE; a reduction similar to a timed setting, as in [14], provides the matching lower bound. To obtain the complexity upper bounds, we prove that inputting words with data efficiency $2|R|+1$ is sufficient to synchronize a DRA.

Figure 1: A DRA with registers $r_{1}, r_{2}, r_{3}$ and the single letter $a$ (omitted from transitions) that can be synchronized in the configuration (synch, $x_{4}$) by the data word $w_{\text{synch}}=\left(a, x_{1}\right)\left(a, x_{2}\right)\left(a, x_{3}\right)\left(a, x_{4}\right)$ if $\left\{x_{1}, x_{2}, x_{3}, x_{4}\right\} \subseteq D$ is a set of 4 distinct data.

The concept of synchronization requires that all runs of an RA, whatever the initial configuration (initial location and register valuations), end in the same configuration $(\ell_{\text{synch}}, \nu_{\text{synch}})$, only depending on the synchronizing data word $w_{\text{synch}}$, formally $\operatorname{post}\left(L \times D^{|R|}, w_{\text{synch}}\right)=\left\{\left(\ell_{\text{synch}}, \nu_{\text{synch}}\right)\right\}$. While processing a synchronizing data word, the infinite set of configurations of RAs must necessarily shrink to a finite set of configurations. The DRA $\mathcal{R}$ with 3 registers depicted in Figure 1 illustrates this phenomenon. Consider the set $\left\{x_{1}, x_{2}, x_{3}\right\} \subseteq D$ of distinct data values: starting from any of the infinitely many configurations in $\{\text{init}\} \times D^{3}$, when processing the data word $\left(a, x_{1}\right)\left(a, x_{2}\right)\left(a, x_{3}\right)$, $\mathcal{R}$ will be in a configuration in the finite set $\left\{\left(\ell_{3},\left(x_{1}, x_{2}, x_{3}\right)\right),\left(\ell_{3}^{\prime},\left(x_{1}, x_{2}, x_{3}\right)\right)\right\}$. We use this observation to provide a linear bound on the number of distinct data values that is sufficient for synchronizing DRAs.

In Lemma 1 below, we prove that data words over only $|R|$ distinct data values are sufficient to shrink the infinite set of all configurations of DRAs to a finite set. We establish this result based on the following two key facts:
(1) When processing a synchronizing data word $w_{\text{synch}}$ from a configuration $(\ell, \nu)$ with some register $r \in R$ such that $\nu(r) \notin \operatorname{data}\left(w_{\text{synch}}\right)$, the register $r$ must be updated. Observe that such updates must happen at inequality-guarded transitions, which themselves must be accessible by inequality-guarded transitions (possibly with no update). As an example, consider the DRA $\mathcal{R}$ in Figure 1, and assume $d_{1}, d_{2} \notin \operatorname{data}\left(w_{\text{synch}}\right)$. The two runs of $\mathcal{R}$ starting from (init, $d_{1}, d_{1}, d_{1}$) and (init, $d_{2}, d_{2}, d_{2}$) first take the transition init $\xrightarrow{\neq r_{1}\ a\ r_{1} \downarrow} \ell_{1}^{\prime}$ updating register $r_{1}$. Next, the two runs must take $\ell_{1}^{\prime} \xrightarrow{\text{else}\ a\ r_{2} \downarrow} \ell_{2}^{\prime}$ to update $r_{2}$ and $\ell_{2}^{\prime} \xrightarrow{\text{else}\ a\ r_{3} \downarrow} \ell_{3}^{\prime}$ to update $r_{3}$; otherwise these two runs would never be synchronized in a single configuration.
(2) Moreover, to shrink the set $L \times D^{|R|}$, for every $\ell \in L$, one can find a word $w_{\ell}$ that leads the DRA from $\{\ell\} \times D^{|R|}$ to some finite set. Since $\mathcal{R}$ is deterministic, appending some prefix or suffix to $w_{\ell}$ achieves the same objective. This allows us to use a variant of the pairwise synchronization technique to shrink the infinite set $L \times D^{|R|}$ to a finite set, by successively inputting $w_{\ell}$ for a location $\ell$ that appears with infinitely many data in the current successor set of $L \times D^{|R|}$.

Lemma 1. For all DRAs for which there exist synchronizing data words, there exists some data word $w$ such that $|\operatorname{data}(w)| \leq|R|$ and $\operatorname{post}\left(L \times D^{|R|}, w\right) \subseteq L \times(\operatorname{data}(w))^{|R|}$.

Proof. Let $\mathcal{R}=\langle L, R, \Sigma, T\rangle$ be a DRA on the data domain $D$ with $k \geq 1$ registers. Let $v$ be a synchronizing data word for $\mathcal{R}$ with $N=|\operatorname{data}(v)|$ distinct data. Suppose that $k<N$; otherwise the statement of the lemma trivially holds.

For all $1 \leq i \leq k$, we say that $x_{i}$ is the $i$-th datum in the synchronizing data word $v=\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{n}, d_{n}\right)$ if there exists $j \leq n$ such that $x_{i}=d_{j}$, $x_{i} \notin\left\{d_{1}, \cdots, d_{j-1}\right\}$ and $\left|\left\{d_{1}, \cdots, d_{j}\right\}\right|=i$. For every $i \leq k$, denote by $\langle L, i\rangle$ the set

$$
\langle L, i\rangle = L \times \left\{\nu \in D^{k} \;\middle|\; \exists R^{\prime} \subseteq R \cdot \left|R^{\prime}\right| \geq i \cdot \forall r \in R^{\prime} \cdot \nu(r) \in\left\{x_{1}, \cdots, x_{i}\right\}\right\}
$$
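To make the two definitions concrete, here is a small Python sketch, assuming data words are lists of (letter, datum) pairs and valuations are tuples. The function names are hypothetical. The first function lists the distinct data of a word in order of first occurrence, so that its $i$-th entry is the $i$-th datum $x_i$; the second tests the valuation part of membership in $\langle L, i\rangle$.

```python
def distinct_data_in_order(word):
    """Return [x_1, x_2, ...]: the distinct data of word, by first occurrence."""
    seen = []
    for (_letter, d) in word:
        if d not in seen:
            seen.append(d)
    return seen


def valuation_in_L_i(valuation, xs, i):
    """True if at least i registers hold a value from {x_1, ..., x_i}.

    xs is the list returned by distinct_data_in_order; the location component
    of a configuration is unconstrained in <L, i>, so it is not passed in.
    """
    allowed = set(xs[:i])
    return sum(1 for value in valuation if value in allowed) >= i


v = [("a", 5), ("b", 7), ("a", 5), ("a", 9)]
xs = distinct_data_in_order(v)  # x_1 = 5, x_2 = 7, x_3 = 9
data_efficiency = len(xs)       # |data(v)|
```

Note that two registers holding the same datum both count, in line with the definition, which only asks for a set $R'$ of at least $i$ registers with values in $\{x_1, \cdots, x_i\}$.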

We claim that for all locations $\ell \in L$ and all $1 \leq i \leq k$, there exists some data word $u_{i}$ such that

- $\operatorname{data}\left(u_{i}\right) \subseteq\left\{x_{1}, x_{2}, \cdots, x_{i}\right\}$, and
- $\operatorname{post}\left(\{\ell\} \times D^{k}, u_{i}\right) \subseteq\langle L, i\rangle$, meaning that after reading $u_{i}$ all reached configurations have at least $i$ registers with values from $\left\{x_{1}, x_{2}, \cdots, x_{i}\right\}$.

For $\ell \in L$, let $w_{\ell}=u_{k}$ satisfy the above condition. Set $S_{0}=L \times D^{k}$ and $w_{0}=\varepsilon$. Then, for all $i=1, \cdots,|L|$, repeat the following: if there exists some $\ell \in L$ such that $\{\ell\} \times\left(D \backslash\left\{x_{1}, \cdots, x_{k}\right\}\right)^{k} \cap S_{i-1} \neq \emptyset$, then set $w_{i}=w_{\ell}$ and $S_{i}=\operatorname{post}\left(S_{i-1}, w_{i}\right)$. Otherwise set $w_{i}=w_{i-1}$ and $S_{i}=S_{i-1}$. Observe that $w=\left(w_{i}\right)_{1 \leq i \leq|L|}$ proves the statement of the lemma. It remains to prove the claim.

Proof of Claim. Let $\hat{\ell}$ be some location in the DRA $\mathcal{R}$. The proof is by an induction on $i$.
Base of induction. Let wait $=\{\hat{\ell}\} \times(D \backslash \operatorname{data}(v))^{k}$ be the set of configurations with location $\hat{\ell}$ such that the data stored in all $k$ registers is not in $\operatorname{data}(v)$. Note that for all configurations $(\hat{\ell}, \nu) \in$ wait, the unique run of $\mathcal{R}$ starting in $(\hat{\ell}, \nu)$ on (a prefix of) $v$ consists of the same sequence of the following transitions:

- a (possibly empty) sequence of transitions $\xrightarrow{\bigwedge_{r \in R} \neq r}$, with inequality guard on all registers and with no update,
- followed by a transition $\xrightarrow{\bigwedge_{r \in R} \neq r\ \text{up} \downarrow}$, with inequality guard on all registers and with an update for some non-empty set $\text{up} \subseteq R$.

Otherwise, the two runs starting from any pair of configurations $\left(\hat{\ell}, \nu_{1}\right),\left(\hat{\ell}, \nu_{2}\right) \in$ wait with unequal valuations $\nu_{1} \neq \nu_{2}$ would end up in distinct configurations, say $\left(\ell, \nu_{1}^{\prime}\right),\left(\ell, \nu_{2}^{\prime}\right)$ with $\nu_{1}^{\prime} \neq \nu_{2}^{\prime}$. This is a contradiction to the fact that the data word $v$ is synchronizing.

Now let the inequality-guarded transition $\xrightarrow{\bigwedge_{r \in R} \neq r\ \text{up} \downarrow}$, updating the registers in up, be fired at the $j$-th input $\left(a_{j}, d_{j}\right)$ while reading $v$; see Figure 2. We prove that the data word $u_{1}=\left(a_{1}, x_{1}\right)\left(a_{2}, x_{1}\right) \cdots\left(a_{j}, x_{1}\right)$ with $\operatorname{data}\left(u_{1}\right)=\left\{x_{1}\right\}$ guides $\{\hat{\ell}\} \times D^{k}$ to a subset in which each configuration has some register with value $x_{1}$: $\operatorname{post}\left(\{\hat{\ell}\} \times D^{k}, u_{1}\right) \subseteq\langle L, 1\rangle$. This phenomenon is depicted in Figure 3 and can be argued as follows. Observe that $x_{1}=d_{1}$ is the first input datum; thus after inputting $(a_{1}, x_{1})$ the set of successors is a disjoint union of two branches:

- either at least one register $r$ has datum $x_{1}$ after the transition $\xrightarrow{\bigvee_{r \in R}=r a_{1}}$. All the following successors in this branch, on input $\left(a_{2}, x_{1}\right)\left(a_{3}, x_{1}\right) \cdots\left(a_{j}, x_{1}\right)$, preserve the datum $x_{1}$ in the register $r$;
- or none of the registers is assigned $x_{1}$ after the transition $\xrightarrow{\bigwedge_{r \in R} \neq r\ a_{1}}$. By inputting $\left(a_{2}, x_{1}\right)\left(a_{3}, x_{1}\right) \cdots\left(a_{j}, x_{1}\right)$, all the following successors in this branch thus take inequality-guarded transitions, and do not update any registers, except for the last transition $\xrightarrow{\bigwedge_{r \in R} \neq r\ \text{up} \downarrow}$ fired by $\left(a_{j}, x_{1}\right)$.

The above argument proves that $u_{1}$ with $\operatorname{data}\left(u_{1}\right) \subseteq\left\{x_{1}\right\}$ is such that $\operatorname{post}\left(\{\hat{\ell}\} \times D^{k}, u_{1}\right) \subseteq\langle L, 1\rangle$. The base of induction holds.
Step of induction. Assume that the induction hypothesis holds for $i-1$, namely, there exists some word $u_{i-1}$ with $\operatorname{data}\left(u_{i-1}\right) \subseteq\left\{x_{1}, \cdots, x_{i-1}\right\}$ such that $\operatorname{post}\left(\{\hat{\ell}\} \times D^{k}, u_{i-1}\right) \subseteq\langle L, i-1\rangle$. To construct $u_{i}$, we define the concept of a symbolic state: we say $(\ell, \text{up}, \nu, j)$ is a symbolic state if $\ell \in L$, the set $\text{up} \subseteq R$ of registers is such that $|\text{up}| \geq \min (j, k)$ and $\nu \in\left\{x_{1}, \cdots, x_{j}\right\}^{k}$ and $j \leq N$. The semantics of $(\ell, \text{up}, \nu, j)$ is the following set:

$$
\llbracket(\ell, \text { up }, \nu, j) \rrbracket=\{\ell\} \times\left\{\nu^{\prime} \in D^{k} \mid \nu^{\prime}(r)=\nu(r) \text { if } r \in \text { up }\right\} .
$$



Figure 2: Runs of $\mathcal{R}$ over the data word $\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{j}, d_{j}\right)$.

Figure 3: Runs of $\mathcal{R}$ over the data word $u_{1}=\left(a_{1}, x_{1}\right)\left(a_{2}, x_{1}\right) \cdots\left(a_{j}, x_{1}\right)$.

Denote by $\Gamma$ the set of all such symbolic states ( $\ell$, up, $\nu, i-1$ ). By definition, the set $\Gamma$ is finite. Now we can construct $u_{i}$ as follows. Let $S_{0}=\operatorname{post}\left(\{\hat{\ell}\} \times D^{k}, u_{i-1}\right)$ and $w_{0}=u_{i-1}$. Recall that $S_{0} \subseteq\langle L, i-1\rangle$ and observe that $S_{0} \subseteq \bigcup_{q \in \Gamma} \llbracket q \rrbracket$. Start with $j=0$ and, while $S_{j} \neq \emptyset$, pick a symbolic state $q=(\ell$, up, $\nu, i-1)$ such that $\llbracket q \rrbracket \cap S_{j} \neq \emptyset$ and construct a word $u_{q}$ (as explained in the details below) such that

- $\operatorname{data}\left(u_{q}\right) \subseteq\left\{x_{1}, x_{2}, \cdots, x_{i}\right\}$, and
- $\operatorname{post}\left(\llbracket q \rrbracket, u_{q}\right) \subseteq\langle L, i\rangle$.

Let $S_{j+1}=\operatorname{post}\left(S_{j} \backslash \llbracket q \rrbracket, u_{q}\right)$ and $w_{j+1}=w_{j} \cdot u_{q}$. Repeat the loop for $j+1$. Observe that $u_{i}=w_{j^{*}}$, where $j^{*} \leq\left|S_{0}\right|$ is such that $S_{j^{*}}=\emptyset$, satisfies the induction statement.

Below, given a symbolic state $q=(\ell, \text{up}, \nu, i-1)$, the aim is to construct the data word $u_{q}$. Without loss of generality, we assume that $|\text{up}|=i-1$; otherwise $u_{q}=u_{i-1}$. Let

$$
\text { wait }=\llbracket(\ell, \text { up }, \nu, i-1) \rrbracket \cap\{\ell\} \times\left\{\nu^{\prime} \mid \nu^{\prime}(r) \in D \backslash \operatorname{data}(v) \text { if } r \notin \operatorname{up}\right\}
$$

be the set of all configurations in the symbolic state $q$, where all data stored in the registers $r \notin \text{up}$ are not in $\operatorname{data}(v)$. Similarly to the induction base, no matter what the register valuation in a configuration in wait looks like, the unique run of $\mathcal{R}$ on the synchronizing word $v=\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{n}, d_{n}\right)$ starting in that configuration takes the same sequence of transitions. Since $\nu \in\left\{x_{1}, \cdots, x_{i-1}\right\}^{k}$, after inputting successive data from $\operatorname{data}(v)$, all successors of configurations in wait are elements of a symbolic state. For all $0 \leq j \leq n$, let the symbolic state $q^{j}=\left(\ell^{j}, \text{up}^{j}, \nu^{j}, N\right)$ be such that $\llbracket q^{0} \rrbracket=\llbracket q \rrbracket \cap$ wait, and $\operatorname{post}\left(\llbracket q^{j-1} \rrbracket,\left(a_{j}, d_{j}\right)\right) \subseteq \llbracket q^{j} \rrbracket$ if $j \geq 1$.
In the sequel, we argue that there exists some $1 \leq m \leq n$ such that, in the sequence of transitions from one symbolic state to another symbolic state over the prefix $\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{m}, d_{m}\right)$ of $v$ (the first $m$ inputs), the following holds:

- on inputting $\left(a_{j}, d_{j}\right)$ for all $1 \leq j<m$, the transition $\xrightarrow{\left(\bigwedge_{r \in \Lambda_{j}}=r\right) \wedge\left(\bigwedge_{r \notin \Lambda_{j}} \neq r\right)\ a_{j}\ \Gamma_{j} \downarrow}$ with $\Lambda_{j}, \Gamma_{j} \subseteq \text{up}$ is taken from $q^{j-1}$ to $q^{j}$. It implies that $\nu^{j-1}(r)=d_{j}$ for all $r \in \Lambda_{j}$, and $\nu^{j}(r)=d_{j}$ for all $r \in \Gamma_{j}$.
- and on inputting $\left(a_{m}, d_{m}\right)$, the transition $\xrightarrow{\left(\bigwedge_{r \in \Lambda_{m}}=r\right) \wedge\left(\bigwedge_{r \notin \Lambda_{m}} \neq r\right)\ a_{m}\ \Gamma_{m} \downarrow}$, which is taken from $q^{m-1}$ to $q^{m}$, is such that $\Lambda_{m} \subseteq \text{up}^{m}$ whereas $\Gamma_{m} \nsubseteq \text{up}^{m}$.

Now from the prefix $\left(a_{1}, d_{1}\right)\left(a_{2}, d_{2}\right) \cdots\left(a_{m}, d_{m}\right)$ of $v$, i.e., the first $m$ inputs, and from the set of data $\left\{x_{1}, x_{2}, \cdots, x_{i}\right\}$, we construct the word $u_{q}=\left(a_{1}, y_{1}\right)\left(a_{2}, y_{2}\right) \cdots\left(a_{m}, y_{m}\right)$ for $q=(\ell$, up, $\nu, i-1)$ as follows: for all $1 \leq j \leq m$,

- if $\Lambda_{j} \neq \emptyset$, i.e., some register $r \in$ up already stores the datum $d_{j}$, then $y_{j}=d_{j}$.
- if $\Lambda_{j}=\emptyset$, i.e., none of the registers $r \in \text{up}$ stores the datum $d_{j}$, then $y_{j}=d$ where $d \in\left\{x_{1}, x_{2}, \cdots, x_{i}\right\} \backslash\left\{\nu^{j-1}(r) \mid r \in \text{up}\right\}$. The existence of such $d$ is guaranteed since $|\text{up}|=i-1$ and $\left|\left\{x_{1}, x_{2}, \cdots, x_{i}\right\}\right|=i$. Moreover, since the transitions $\xrightarrow{\left(\bigwedge_{r \in \text{up}} \neq r\right)\ a_{j}\ \Gamma_{j} \downarrow}$ have inequality guards for all registers, changing the datum from $d_{j}$ to $y_{j}$ results in taking the same transition.

Observe that $\operatorname{data}\left(u_{q}\right) \subseteq\left\{x_{1}, \cdots, x_{i}\right\}$. As a result, all registers that are updated along the runs of $\mathcal{R}$ over $u_{q}$ store some datum from $\left\{x_{1}, \cdots, x_{i}\right\}$. This argument shows that post $\left(\llbracket q \rrbracket, u_{q}\right) \subseteq\langle L, i\rangle$. This concludes the step of induction, and completes the proof.

After reading some word that shrinks the infinite set of configurations of DRAs to a finite set $S$ of configurations, we generalize the pairwise synchronization technique [30] to finally synchronize configurations in $S$. By this generalization, we achieve the following Lemma 2, for which the detailed proof can be found in Appendix 6.

Lemma 2. For all DRAs for which there exist synchronizing data words, there exists a synchronizing data word $w$ such that $|\operatorname{data}(w)| \leq 2|R|+1$.

Given a 1-DRA $\mathcal{R}$, the synchronization problem can be solved as follows. (1) Check that from each location $\ell$ an update on the single register is achieved by going through inequality-guarded transitions, which can be done in NLOGSPACE. Lemma 1 ensures that feeding $\mathcal{R}$ consecutively with a single datum $x \in D$ is sufficient for this phase, and the set of successors of $L \times D$ is then a subset of $L \times\{x\}$. (2) Pick an arbitrary set $\{x, y, z\}$ of data including $x$. By Lemma 2 and the pairwise synchronization technique, the problem reduces to the synchronization problem for DFAs where data in registers and input data extend locations and the alphabet: the state set is $Q=L \times\{x, y, z\}$ and the alphabet is $\Sigma \times\{x, y, z\}$. Since a 1-DRA, where all transitions update the register and are guarded with true, is equivalent to a DFA, we obtain the next theorem.
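The finite product used in step (2) can be sketched in Python as follows. This only illustrates how register contents and input data are folded into DFA states and letters; it does not implement step (1), and it is not claimed to be the authors' construction in detail. The encoding of transitions as tuples `(source, letter, guard, updates, target)` with `guard` one of `"="`, `"!="`, `"true"`, as well as the toy automaton `TOY_1DRA`, are assumptions of this example.

```python
from itertools import product


def dra1_to_dfa(locations, alphabet, transitions, data):
    """Build the DFA with states L x data and letters Sigma x data.

    transitions: list of (source, letter, guard, updates, target) for a
    complete, deterministic one-register automaton (hypothetical encoding).
    """
    def enabled(guard, stored, d):
        return (guard == "true"
                or (guard == "=" and stored == d)
                or (guard == "!=" and stored != d))

    states = list(product(locations, data))
    letters = list(product(alphabet, data))
    delta = {}
    for (loc, stored), (a, d) in product(states, letters):
        successors = [
            (dst, d if updates else stored)
            for (src, letter, guard, updates, dst) in transitions
            if src == loc and letter == a and enabled(guard, stored, d)
        ]
        if len(successors) != 1:
            raise ValueError("automaton is not complete and deterministic")
        delta[(loc, stored), (a, d)] = successors[0]
    return states, letters, delta


# Hypothetical toy 1-DRA over the single letter "a".
TOY_1DRA = [
    ("p", "a", "!=", True, "q"),    # p --(!=, a, update)--> q
    ("p", "a", "=", False, "p"),    # p --(=, a)--> p
    ("q", "a", "true", True, "q"),  # q --(true, a, update)--> q
]

dfa_states, dfa_letters, dfa_delta = dra1_to_dfa(
    ["p", "q"], ["a"], TOY_1DRA, ("x", "y", "z")
)
```

The resulting transition table can then be handed to any DFA synchronization procedure, such as the pairwise technique recalled in Section 2.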

Theorem 3. The synchronization problem for 1-DRAs is NLOGSPACE-complete.
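
Step (2) of the procedure above ends with a plain DFA synchronization test. A minimal Python sketch of the classical pairwise criterion (a complete DFA is synchronizing if and only if every pair of states can be merged by some word) is given below. The function name `is_synchronizing` and the dictionary encoding `delta[(state, letter)]` are illustrative assumptions, not notation from the paper.

```python
from collections import deque
from itertools import combinations

def is_synchronizing(states, alphabet, delta):
    """Pairwise test for a complete DFA given as delta[(state, letter)] -> state."""
    def norm(p, q):
        # Unordered pairs are stored with the smaller state first.
        return (p, q) if p <= q else (q, p)

    # Reverse edges of the pair automaton: successor pair -> predecessor pairs.
    preds = {}
    for p, q in combinations(sorted(states), 2):
        for a in alphabet:
            succ = norm(delta[(p, a)], delta[(q, a)])
            preds.setdefault(succ, []).append((p, q))

    # Backward search from the diagonal pairs (s, s), i.e. already merged pairs.
    mergeable = set()
    queue = deque((s, s) for s in states)
    while queue:
        pair = queue.popleft()
        for pre in preds.get(pair, []):
            if pre not in mergeable:
                mergeable.add(pre)
                queue.append(pre)

    return all(pair in mergeable for pair in combinations(sorted(states), 2))
```

Each pair is handled by a reachability query in the pair automaton, which is the kind of check that fits into nondeterministic logarithmic space. For a 1-DRA, the states would be the pairs in $L \times\{x, y, z\}$ and the letters the pairs in $\Sigma \times\{x, y, z\}$.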
We provide a family of DRAs, for which a linear bound on the data efficiency of synchronizing data words, depending on the number of registers, is necessary. This necessary and sufficient bound is crucial to establish membership of synchronizing DRAs in PSPACE.

Lemma 4. There is a family of single-letter $\operatorname{DRAs}\left(\mathcal{R}_{n}\right)_{n \in \mathbb{N}}$, with $n=|R|$ registers and $\mathcal{O}(n)$ locations, such that all synchronizing data words have data efficiency $\Omega(n)$.

Proof. The family of DRAs $\mathcal{R}_{n}$ $(n \in \mathbb{N})$ is defined over an infinite data domain $D$. The DRA $\mathcal{R}_{n}$ has $n$ registers and a single letter $a$. The structure of $\mathcal{R}_{n}$ is composed of two distinguished locations init and synch and two chains, where each chain has $n$ locations: $\ell_{1}, \ell_{2}, \cdots, \ell_{n}$ and $\ell_{1}^{\prime}, \ell_{2}^{\prime}, \cdots, \ell_{n}^{\prime}$. The DRA $\mathcal{R}_{3}$ is shown in Figure 1. The only transition in synch is a self-loop with an update on all $n$ registers; thus $\mathcal{R}_{n}$ can only be synchronized in synch. There are two transitions in init, each going to one of the chains:

$$
\text{init} \xrightarrow{=r_{1}\ a\ r_{1} \downarrow} \ell_{1} \quad \text{and} \quad \text{init} \xrightarrow{\neq r_{1}\ a\ r_{1} \downarrow} \ell_{1}^{\prime} .
$$

Then, $\operatorname{post}\left(\{\right.$ init $\left.\} \times D^{n},(a, x)\right)=\left\{\ell_{1}, \ell_{1}^{\prime}\right\} \times\left(\{x\} \times D^{n-1}\right)$ for all $x \in D$.
From $\left\{\ell_{1}, \ell_{1}^{\prime}\right\} \times\left(\{x\} \times D^{n-1}\right)$, informally speaking, in both chains the respective $i$-th locations are simultaneously reached after inputting $i$ distinct data: for all $1 \leq i<n$, in each $\ell_{i}$ and $\ell_{i}^{\prime}$ there are two transitions. One transition is a self-loop, with a satisfied equality guard on at least one of the updated registers $r_{1}, \ldots, r_{i}$ so far. The other transition goes to the next location $\ell_{i+1}$ in the chain, with an inequality guard on all updated registers $r_{1}, r_{2}, \cdots, r_{i}$ so far, and an update on the next register $r_{i+1}$.

$$
\ell_{i} \xrightarrow{\bigvee_{r \in\left\{r_{1}, \cdots, r_{i}\right\}}(=r)\ a} \ell_{i} \quad \text{and} \quad \ell_{i} \xrightarrow{\bigwedge_{r \in\left\{r_{1}, \cdots, r_{i}\right\}}(\neq r)\ a\ r_{i+1} \downarrow} \ell_{i+1}
$$

$$
\ell_{i}^{\prime} \xrightarrow{\bigvee_{r \in\left\{r_{1}, \cdots, r_{i}\right\}}(=r)\ a} \ell_{i}^{\prime} \quad \text{and} \quad \ell_{i}^{\prime} \xrightarrow{\bigwedge_{r \in\left\{r_{1}, \cdots, r_{i}\right\}}(\neq r)\ a\ r_{i+1} \downarrow} \ell_{i+1}^{\prime}
$$

At the last locations $\ell_{n}$ and $\ell_{n}^{\prime}$ of the two chains, there is one transition with inequality guards on all registers leaving the chain to synch, and there is one transition which is, again, a self-loop with an equality constraint for at least one of the registers.

$$
\ell_{n} \xrightarrow{\bigwedge_{r \in R}(\neq r)\ a\ R \downarrow} \text{synch} \quad \text{and} \quad \ell_{n} \xrightarrow{\text{else}\ a} \ell_{n}
$$

$$
\ell_{n}^{\prime} \xrightarrow{\bigwedge_{r \in R}(\neq r)\ a\ R \downarrow} \text{synch} \quad \text{and} \quad \ell_{n}^{\prime} \xrightarrow{\text{else}\ a} \ell_{n}^{\prime}
$$

By construction, we see that $n+1$ distinct data values must be read for reaching synch from the infinite set $\{$ init $\} \times D^{n}$. Since $\mathcal{R}_{n}$ can only be synchronized in synch, all synchronizing data words must have data efficiency at least $n+1 \in \Omega(n)$.

It remains to prove that $\mathcal{R}_{n}$ indeed has some synchronizing data word. Let $\left\{x_{1}, x_{2}, \cdots, x_{n+1}\right\}$ be a set of $n+1$ distinct data values and $w_{\text{synch}}=\left(a, x_{1}\right)\left(a, x_{2}\right) \cdots\left(a, x_{n}\right)\left(a, x_{n+1}\right)$. For the set of locations $L=\left\{\text{init}, \text{synch}, \ell_{1}, \cdots, \ell_{n}, \ell_{1}^{\prime}, \cdots, \ell_{n}^{\prime}\right\}$, observe that $\operatorname{post}\left(L \times D^{n}, w_{\text{synch}}\right)=\left\{\left(\text{synch}, x_{n+1}\right)\right\}$ and $\left|\operatorname{data}\left(w_{\text{synch}}\right)\right|=n+1$. The proof is complete.
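
The behaviour of $\mathcal{R}_{n}$ can be replayed on concrete configurations. The following Python sketch simulates a single run under an assumed encoding that is not from the paper: locations are `'init'`, `'synch'`, and pairs `('l', i)` and `('lp', i)` for the two chains, and a valuation is a tuple of $n$ data values. It only illustrates the construction on sample inputs and does not replace the proof.

```python
def step(n, loc, regs, d):
    """One a-transition of R_n on datum d (hypothetical encoding of locations)."""
    regs = list(regs)
    if loc == 'synch':
        return 'synch', [d] * n          # self-loop updating all registers
    if loc == 'init':
        chain = 'l' if d == regs[0] else 'lp'
        regs[0] = d                      # both init transitions update r_1
        return (chain, 1), regs
    chain, i = loc
    if d in regs[:i]:                    # equality with one of r_1..r_i: self-loop
        return loc, regs
    if i < n:
        regs[i] = d                      # inequality on r_1..r_i: update r_{i+1}
        return (chain, i + 1), regs
    return 'synch', [d] * n              # leave the chain from the n-th location

def run(n, loc, regs, data):
    """Run R_n from configuration (loc, regs) over the data values in `data`."""
    for d in data:
        loc, regs = step(n, loc, regs, d)
    return loc, tuple(regs)
```

For $n=3$, calling `run(3, 'init', (7, 8, 9), [1, 2, 3, 4])` follows the primed chain and ends in synch, whereas a word that only repeats the three data values 1, 2, 3 gets stuck in the last chain location.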

Theorem 5. The synchronization problem for $k$-DRAs is PSPACE-complete.

Proof. (Sketch) The synchronization problem for $k$-DRAs is in PSPACE using the following co-(N)PSPACE algorithm: (1) pick a set $X=\left\{x_{1}, x_{2}, \cdots, x_{2 k+1}\right\}$ of distinct data values. (2) Guess some location $\ell \in L$ and check if there is no word $w \in\left(\Sigma \times\left\{x_{1}, x_{2}, \cdots, x_{k}\right\}\right)^{*}$ with length $|w| \leq 2^{k|L||\Sigma|}$ such that along firing transitions that are inequality-guarded on all $k$ registers, some registers are not updated. If (2) is satisfied, then return "no" (meaning that there is no synchronizing data word for the input $k$-DRA). Otherwise, (3) guess two configurations $q_{1}, q_{2} \in L \times X^{k}$ such that there is no word $w \in(\Sigma \times X)^{*}$ with length $|w| \leq 2^{(2 k+1)|L||\Sigma|}$ such that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w\right)\right|=1$. If (3) is satisfied, then the algorithm returns "no"; otherwise it returns "yes".

For PSPACE-hardness, we adapt an established reduction (see, e.g., [14]) from the non-emptiness problem for $k$-DRAs, see Appendix 6. The result then follows by PSPACE-completeness of the non-emptiness problem for $k$-DRAs [11].

## 4 Synchronizing data words for NRAs

In this section, we study the synchronization problems for NRAs. We slightly update a result in [14] to present a general reduction from the non-universality problem to the synchronization problem for NRAs. This reduction proves the undecidability result for the synchronization problem for $k$-NRAs, and Ackermann-hardness for 1-NRAs. We then prove that for 1-NRAs, the synchronization and non-universality problems are indeed interreducible, which completes the picture by Ackermann-completeness of the synchronization problem for 1-NRAs.

In the nondeterministic synchronization setting, we present two kinds of counting features, which are useful for later constructions. For the first one, we define a family $\left(\mathcal{R}_{\text{counter}(n)}\right)_{n \in \mathbb{N}}$ of 1-NRAs with size only linear in $n$, where an input datum $x \in D$ must be read $2^{n}$ times to achieve synchronization.

Lemma 6. There is a family of 1-NRAs $\left(\mathcal{R}_{\text{counter}(n)}\right)_{n \in \mathbb{N}}$ with $\mathcal{O}(n)$ locations, such that for all synchronizing data words $w$, some datum $d \in \operatorname{data}(w)$ appears in $w$ at least $2^{n}$ times.

Proof. (Sketch) The 1-NRA $\mathcal{R}_{\text{counter}(n)}$ shown in Figure 4 encodes a binary counter that ensures that in every synchronizing data word $w$ some datum $x \in \operatorname{data}(w)$ appears at least $2^{n}$ times. The location synch has self-loops on all letters; thus, $\mathcal{R}_{\text{counter}(n)}$ can only be synchronized in location synch. Generally speaking, the counting involves an initializing process and several incrementing processes. The initializing process is started by firing a $\star$-transition, which places a token, let us say an $x$-token, into location zero. This sets the counter to 0. Note that firing $\star$-transitions is the only way to guide tokens out of reset; hence, whenever there is some token in reset, a new initializing process must be started. We use this to enforce a new initializing process whenever some transition is fired that is incorrect with respect to the incrementing process.


Figure 4: A partial picture of the 1-NRA $\mathcal{R}_{\text{counter}(n)}$ (with $n \geq 3$) implementing a binary counter. In order to avoid crossing edges in the figure, we use two copies of the same location reset. All locations have inequality-guarded self-loops for all letters in $\Sigma \backslash\{\star\}$. All missing equality-guarded $\star$-transitions are directed to zero. For all $0 \leq i<n$, missing equality-guarded \#-transitions from $2_{c}^{i}$ are guided to synch with an update on the register. All other non-depicted equality-guarded transitions are directed to reset, and inequality-guarded transitions are self-loops.

An incrementing process can be set off by inputting the datum $x$ via equality guards. The numbers $1 \leq m \leq 2^{n}$ are represented by placing a copy of the $x$-token in the locations corresponding to the binary representation of $m$. An $x$-token in location $2^{i}$ (in $2_{c}^{i}$, respectively) means that the $i$-th least significant bit in the binary representation is set to 1 (to 0, respectively). First, a $\mathrm{Bit}_{0}$-transition places a copy of the $x$-token in each of $\left\{2_{c}^{n}, \ldots, 2_{c}^{2}, 2_{c}^{1}, 2^{0}\right\}$ to represent $0 \ldots 001$. In each incrementation step the $x$-tokens are re-placed by firing specific $\mathrm{Bit}_{i}$-transitions $(0 \leq i \leq n)$, following the standard procedure of binary incrementation. At the end, when a copy of the $x$-token is located in each of $\left\{2^{n}, 2_{c}^{n-1}, \ldots, 2_{c}^{0}\right\}$ (representing $10 \ldots 0$), the \#-transitions guide all of these tokens to location synch and finally synchronize $\mathcal{R}_{\text{counter}(n)}$. We give a detailed explanation of the structure of $\mathcal{R}_{\text{counter}(n)}$ in Appendix 7.
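
The token placement that encodes a counter value can be made concrete as follows. This Python sketch is only an illustration: the location names are rendered as strings such as `'2^3'` and `'2_c^3'`, and it assumes, for the purpose of counting, that the initial $\mathrm{Bit}_{0}$ step and every incrementation step each read the datum $x$ once; the precise transitions are those of the appendix and are not modelled here.

```python
def placement(m, n):
    """Locations holding a copy of the x-token when the counter value is m:
    '2^i' if bit i of m is 1, '2_c^i' if it is 0, listed for i = n, ..., 0."""
    return [f"2^{i}" if (m >> i) & 1 else f"2_c^{i}" for i in range(n, -1, -1)]

def count_readings(n):
    """Count readings of x under the stated assumption: one for the initial
    Bit_0 step (value 1), then one per increment until the value 2^n."""
    m, readings = 1, 1
    while m < 2 ** n:
        m += 1
        readings += 1
    return readings, placement(m, n)
```

Under this assumption the count is $2^{n}$, which matches the bound stated in Lemma 6.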

We present a second kind of counting feature in RAs that explains the hardness of synchronizing NRAs, even with a single register. In Lemma 7, we define a family of 1-NRAs (with only $\mathcal{O}(n)$ locations), where $\operatorname{tower}(n)$ distinct data must be read to gain synchronization. Recall from [27] that the function tower is at level three of the infinite Ackermann hierarchy $\left(A_{k}\right)_{k \in \mathbb{N}}$ of fast-growing functions $A_{k}: \mathbb{N} \rightarrow \mathbb{N}$, inductively defined by $A_{1}(n)=2 n$ and $A_{k+1}(n)=A_{k}^{n}(1)=\underbrace{A_{k}\left(\ldots\left(A_{k}\right.\right.}_{n \text { times }}(1)) \ldots)$. Hence, applying $\text{doub} \stackrel{\text{def}}{=} A_{1}$, $\exp \stackrel{\text{def}}{=} A_{2}$, and $\text{tower} \stackrel{\text{def}}{=} A_{3}$, respectively, to some natural number $n$ results in some number that is double, exponential, and tower, respectively, in $n$. The function $A_{\omega}(n)=A_{n}(n)$ is a non-primitive recursive Ackermann-like function, defined by diagonalization.
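
To get a feeling for the growth of the first levels, the inductive definition $A_{1}(n)=2 n$, $A_{k+1}(n)=A_{k}^{n}(1)$ can be evaluated directly for small arguments. The Python function `A` below is an illustrative sketch of that definition; its name and signature are ours.

```python
def A(k, n):
    """Level k of the hierarchy: A_1(n) = 2n and A_{k+1}(n) = A_k^n(1)."""
    if k == 1:
        return 2 * n
    value = 1
    for _ in range(n):          # apply A_{k-1} exactly n times, starting from 1
        value = A(k - 1, value)
    return value
```

With this definition, `A(2, n)` equals $2^{n}$ and `A(3, n)` is a tower of $n$ twos, so the values of `A(3, n)` for $n=1,2,3,4$ are 2, 4, 16 and 65536. Only small arguments are feasible, since the values explode from level three on.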

Lemma 7. There is a family of 1-NRAs $\left(\mathcal{R}_{\operatorname{tower}(n)}\right)_{n \in \mathbb{N}}$ with $\mathcal{O}(n)$ locations, such that $|\operatorname{data}(w)| \geq \operatorname{tower}(n)$ for all synchronizing data words $w$.

Proof. The domain of the family of 1-NRAs $\left(\mathcal{R}_{\operatorname{tower}(n)}\right)_{n \in \mathbb{N}}$ is the natural numbers $\mathbb{N}$. The alphabet of $\mathcal{R}_{\operatorname{tower}(n)}$ is $\Sigma=\{\#, \star, \text{rep}, \text{doub}, \exp, \text{tow}\}$. The structure of $\mathcal{R}_{\operatorname{tower}(n)}$ is composed of $n$ locations $\operatorname{data}_{1}, \operatorname{data}_{1,2}, \cdots, \operatorname{data}_{1,2, \cdots, n}$ and 6 more locations reset, synch, store, rep, waitDoub, waitExp. The general structure of $\mathcal{R}_{\operatorname{tower}(n)}$ is partially depicted in Figure 5. The NRA $\mathcal{R}_{\operatorname{tower}(n)}$ is such that $|\operatorname{data}(w)| \geq \operatorname{tower}(n)$ for all synchronizing data words $w$.

All transitions in synch are self-loops with an update on the register, synch $\xrightarrow{\Sigma\ r \downarrow}$ synch; thus, $\mathcal{R}_{\operatorname{tower}(n)}$ can only be synchronized in synch. Moreover, synch is only accessible from store by a \#-transition. Assuming $w$ is one of the shortest synchronizing words, we see that $\operatorname{post}(L \times D, w)=\{(\text{synch}, x)\}$, where $w$ ends with $(\#, x)$.

From all locations $\ell \in L \backslash\{\text{synch}\}$, we have $\ell \xrightarrow{\star\ r \downarrow} \operatorname{data}_{1}$; we say that $\star$-transitions reset $\mathcal{R}_{\operatorname{tower}(n)}$. Moreover, the only outgoing transition in location reset is the $\star$-transition. Thus, a reset must occur in order to synchronize $\mathcal{R}_{\operatorname{tower}(n)}$. After this forced reset, say on reading $(\star, 1)$, the set of reached configurations is $\left\{\left(\operatorname{data}_{1}, 1\right),(\text{synch}, 1)\right\}$. Since resetting is inefficient, we try to avoid it; we call all transitions leading to reset inefficient.

Figure 5: A partial illustration of the 1-NRA $\mathcal{R}_{\operatorname{tower}(n)}$ for $n \geq 3$. All $\star$-transitions are guided to $\operatorname{data}_{1}$ with an update on the register. All other non-depicted transitions are directed to reset.

For all locations $\operatorname{data}_{1, \cdots, i}$ with $1 \leq i<n$, we define the two transitions

$$
\operatorname{data}_{1, \cdots, i} \xrightarrow{\neq r \text { rep }} \operatorname{data}_{1, \cdots, i+1} \quad \text { and } \quad \operatorname{data}_{1, \cdots, i} \xrightarrow{\neq r \text { rep } r \downarrow} \text { data }_{1, \cdots, i+1} .
$$

All other transitions in $\operatorname{data}_{1, \cdots, i}$ are inefficient and directed to reset. Below, we rename $\operatorname{data}_{1,2, \cdots, n}$ to waitTow. We partially depict the transitions from waitTow, waitExp, waitDoub, rep and store in Figure 5. All transitions are inefficient, except

- waitTow $\xrightarrow{=r \text{ tow}}$ waitExp, waitTow $\xrightarrow{\neq r \text{ tow}}$ waitTow, and waitTow $\xrightarrow{\sigma}$ waitTow for all $\sigma \in\{\text{doub}, \exp, \text{rep}\}$,
- waitExp $\xrightarrow{=r \exp}$ waitDoub, waitExp $\xrightarrow{\text{doub}}$ waitExp and waitExp $\xrightarrow{\text{rep}}$ waitExp,
- waitDoub $\xrightarrow{=r \text{ doub}}$ rep, waitDoub $\xrightarrow{\neq r \text{ doub}}$ waitDoub and waitDoub $\xrightarrow{\neq r \text{ rep}}$ waitDoub,
- rep $\xrightarrow{\neq r \text{ rep}}$ store and rep $\xrightarrow{\neq r \text{ rep } r \downarrow}$ store,
- store $\xrightarrow{\text{tow}}$ waitExp, store $\xrightarrow{\exp}$ waitDoub, store $\xrightarrow{\neq r \text{ doub}}$ store and store $\xrightarrow{\neq r \text{ rep}}$ store, and
- store $\xrightarrow{\#\ r \downarrow}$ synch.

We remark that store $\xrightarrow{\#\ r \downarrow}$ synch is the only \#-transition that is not inefficient. This implies that for efficiently synchronizing $\mathcal{R}_{\operatorname{tower}(n)}$, one needs to move all produced tokens to store before firing a \#-transition. The main issue in moving the produced tokens, however, is that some inequality-guarded transitions are unavoidable, and these transitions may replicate the tokens. For example, if one token is in $\operatorname{data}_{1}$, firing the two transitions $\operatorname{data}_{1} \xrightarrow{\neq r \text{ rep}} \operatorname{data}_{1,2}$ and $\operatorname{data}_{1} \xrightarrow{\neq r \text{ rep } r \downarrow} \operatorname{data}_{1,2}$ replicates one token to two tokens in $\operatorname{data}_{1,2}$. Using this, one can implement doubling, exponentialization, and towering of distinct tokens, as explained in the following.

Doubling: Assume that there are $n$ distinct tokens $\{1,2, \ldots, n\}$ in waitDoub. Then the only efficient transition is waitDoub $\xrightarrow{=r \text{ doub}}$ rep. In particular, all $\{\#, \exp, \text{tow}\}$-transitions activate a reset. As a result, as long as some token is in waitDoub, $\{\#, \exp, \text{tow}\}$-transitions should be avoided for the sake of efficiency. This implies that for all $1 \leq i \leq n$, the $i$-token in waitDoub can leave the location only individually on the input $(\text{doub}, i)$. Now, inputting $(\text{doub}, i)$ moves the $i$-token to the location rep. Here the $i$-token must immediately move on to store via the inequality-guarded rep-transitions, which will replicate the $i$-token into two tokens. Note that we must fire rep-transitions with some "fresh" datum $j$ such that $j \notin\{1, \ldots, n\}$, otherwise a reset is evoked. (For simplicity, we use $j=i+n$ by convention.) It can now be easily seen that the only efficient way to guide all $n$ tokens out of waitDoub is by inputting the data word

$$
w_{\operatorname{doub}(n)}=(\text { doub }, 1)(\text { rep }, n+1)(\text { doub }, 2)(\text { rep }, n+2) \ldots(\text { doub }, n)(\text { rep }, 2 n),
$$

which puts $2 n$ distinct tokens into store.

Exponentialization: Assume there are $n$ distinct tokens $\{1,2, \ldots, n\}$ in waitExp. The only efficient transition is waitExp $\xrightarrow{=r \exp}$ waitDoub. In particular, all $\{\#, \text{tow}\}$-transitions activate a reset, and should be avoided as long as some token is in waitExp. This implies that for all $1 \leq i \leq n$, the $i$-token in waitExp can leave the location only individually on the input $(\exp, i)$. Now, inputting $(\exp, 1)$ moves the 1-token to waitDoub. From above we know that the only efficient way for guiding a single token in waitDoub towards synchronization is by inputting the data word $w_{\operatorname{doub}(1)}$, resulting in two distinct tokens in store: 1 and 2. We can now proceed to remove the 2-token from waitExp by inputting $(\exp, 2)$. Note that this also guides the $\{1,2\}$-tokens residing in store to waitDoub. Again, for efficient synchronization, we must input the data word $w_{\operatorname{doub}(2)}$, which results in four distinct tokens $\{1,2,3,4\}$ in store. It is now easy to see that the only efficient way to guide all $n$ tokens out of waitExp is by inputting the data word

$$
w_{\exp (n)}=(\exp , 1) \cdot w_{\operatorname{doub}(1)} \cdot(\exp , 2) \cdot w_{\operatorname{doub}(2)} \cdot(\exp , 3) \cdot w_{\operatorname{doub}(4)} \cdot \ldots \cdot(\exp , n) \cdot w_{\operatorname{doub}\left(2^{n-1}\right)}
$$

which puts $2^{n}$ distinct tokens into store.

Towering: Assume there are $n$ distinct tokens $\{1,2, \ldots, n\}$ in waitTow. The only efficient transition is waitTow $\xrightarrow{=r \text{ tow}}$ waitExp. In particular, firing \#-transitions activates a reset, and should be avoided as long as some token is in waitTow. This implies that for all $1 \leq i \leq n$, the $i$-token in waitTow can leave the location only individually on the input $(\text{tow}, i)$. Now, inputting $(\text{tow}, 1)$ moves the 1-token to waitExp. From above we know that the only efficient way for guiding a single token in waitExp towards synchronization is by inputting the data word $w_{\exp (1)}$, resulting in two distinct tokens in store: 1 and 2. We can now proceed to remove the 2-token from waitTow by inputting $(\text{tow}, 2)$. Note that this also guides the $\{1,2\}$-tokens residing in store to waitExp. Again, for efficient synchronization, we must input the data word $w_{\exp (2)}$, which results in four distinct tokens $\{1,2,3,4\}$ in store. It is now easy to see that the only efficient way to guide all $n$ tokens out of waitTow is by inputting the data word

$$
w_{\text {tow }(n)}=(\text { tow }, 1) \cdot w_{\exp (1)} \cdot(\text { tow }, 2) \cdot w_{\exp (2)} \cdot(\text { tow }, 3) \cdot w_{\exp (4)} \cdot \ldots \cdot(\text { tow }, n) \cdot w_{\exp (\operatorname{tower}(n-1))}
$$

which puts $\operatorname{tower}(n)$ distinct tokens into store.

Now, after the (forced) initial reset by firing $\star$-transitions, it is easy to see that the only data word that advances in synchronizing is $(\text{rep}, 2)(\text{rep}, 3) \cdots(\text{rep}, n)$. It replicates the 1-token to $n$ distinct tokens $1,2, \cdots, n$, which are placed into waitTow. From above we know that the only efficient way to guide all $n$ tokens out of waitTow is by inputting $w_{\text{tow}(n)}$, which places $\operatorname{tower}(n)$ distinct tokens into store. We can now fire \#-transitions to synchronize $\mathcal{R}_{\operatorname{tower}(n)}$ without evoking a reset, but note that due to the equality guard at the \#-transition from store to synch, each of the $\operatorname{tower}(n)$ distinct tokens in store can move to synch only individually. This implies $|\operatorname{data}(w)| \geq \operatorname{tower}(n)$ for all synchronizing words $w$.
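
The three data words used in the proof are defined by simple recursion, so their shape and the number of distinct data they contain can be checked mechanically for small $n$. The following Python sketch builds $w_{\operatorname{doub}(n)}$, $w_{\exp (n)}$ and $w_{\text{tow}(n)}$ as lists of (letter, datum) pairs; the function names and the list encoding are illustrative assumptions, and `tower(0)` is taken to be 1 so that the first factor is $w_{\exp (1)}$.

```python
def w_doub(n):
    """(doub,1)(rep,n+1)(doub,2)(rep,n+2)...(doub,n)(rep,2n)"""
    word = []
    for i in range(1, n + 1):
        word += [("doub", i), ("rep", n + i)]
    return word

def w_exp(n):
    """(exp,1) w_doub(1) (exp,2) w_doub(2) ... (exp,n) w_doub(2^(n-1))"""
    word = []
    for i in range(1, n + 1):
        word += [("exp", i)] + w_doub(2 ** (i - 1))
    return word

def tower(n):
    """Tower of n twos, with tower(0) = 1."""
    value = 1
    for _ in range(n):
        value = 2 ** value
    return value

def w_tow(n):
    """(tow,1) w_exp(1) (tow,2) w_exp(2) ... (tow,n) w_exp(tower(n-1))"""
    word = []
    for i in range(1, n + 1):
        word += [("tow", i)] + w_exp(tower(i - 1))
    return word

def data(word):
    """Set of data values occurring in a data word."""
    return {d for _, d in word}
```

For instance, `data(w_doub(n))` has $2 n$ elements, `data(w_exp(n))` has $2^{n}$ elements, and `data(w_tow(3))` has $16=\operatorname{tower}(3)$ elements. The sketch does not simulate $\mathcal{R}_{\operatorname{tower}(n)}$ itself; it only reproduces the words and their data efficiency.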

We can now use similar ideas as in Lemma 7 for defining a family of 1-NRAs $\mathcal{R}_{A_{n}(m)}$ $(n, m \in \mathbb{N})$ such that all synchronizing data words of $\mathcal{R}_{A_{n}(m)}$ have data efficiency at least $A_{n}(m)$, where $A_{n}$ is at level $n$ of the Ackermann hierarchy. This provides a good intuition that the synchronization problem for NRAs must be Ackermann-hard, even if the NRA has a single register. In the following, we prove that the synchronization problem and the non-universality problem for 1-NRAs are interreducible.

Let us first define the non-universality problem for RAs. To define the language of a given NRA $\mathcal{R}$, we equip it with an initial location $\ell_{\mathrm{in}}$ and a set $L_{\mathrm{f}}$ of accepting locations, where, without loss of generality, we assume that all outgoing transitions from $\ell_{\mathrm{in}}$ update all registers. The language $L(\mathcal{R})$ is the set of all data words $w \in(\Sigma \times D)^{*}$ for which there is a run from $\left(\ell_{\mathrm{in}}, \nu_{\mathrm{in}}\right)$ to $\left(\ell_{\mathrm{f}}, \nu_{\mathrm{f}}\right)$ such that $\ell_{\mathrm{f}} \in L_{\mathrm{f}}$ and $\nu_{\mathrm{in}}, \nu_{\mathrm{f}} \in D^{|R|}$. The non-universality problem asks, given an RA, whether there exists some data word $w$ over $\Sigma$ such that $w \notin L(\mathcal{R})$. We adapt an established reduction from [14] to provide the following lemma.

Lemma 8. The non-universality problem is reducible to the synchronization problem for NRAs.

The detailed proof can be found in Appendix 7. As an immediate result of Lemma 8 and the undecidability of the non-universality problem for NRAs (Theorems 2.7 and 5.4 in [11]), we obtain the following theorem.

Theorem 9. The synchronization problem for NRAs is undecidable.

Next, we present a reduction showing that, for 1-NRAs, the synchronization problem is reducible to the non-universality problem, providing tight complexity bounds for the synchronization problem.

Lemma 10. The synchronization problem is reducible to the non-universality problem for 1-NRAs.

Proof. We establish a reduction from the synchronization problem to the non-universality problem for 1-NRAs as follows. Given a 1-NRA $\mathcal{R}=\langle L, R, \Sigma, T\rangle$, we construct a 1-NRA $\mathcal{R}_{\text{comp}}$ equipped with an initial location and a set of accepting locations such that $\mathcal{R}$ has some synchronizing word if, and only if, there exists some data word that is not in $L\left(\mathcal{R}_{\text{comp}}\right)$.

First, we see that an analogue of Lemma 1 holds for 1-NRAs: for all 1-NRAs with some synchronizing data word, there exists some word $w$ with data efficiency 1 such that $\operatorname{post}(L \times D, w) \subseteq L \times \operatorname{data}(w)$. For all locations $\ell \in L$, such a data word must update the register by firing an inequality-guarded transition that is reached only via inequality-guarded transitions; this can be checked in NLOGSPACE. Given $\mathcal{R}$, we assume that such a data word $w$ always exists; otherwise, we define $\mathcal{R}_{\text{comp}}$ to be a 1-NRA with a single (initial and accepting) location equipped with self-loops for all letters, so that $L\left(\mathcal{R}_{\text{comp}}\right)=(\Sigma \times D)^{*}$. Given $\operatorname{data}(w)=\{x\}$, we say that $\mathcal{R}$ has some synchronizing word $v$ if $\operatorname{post}(L \times\{x\}, v)$ is a singleton.

Second, we define a data language lang such that data words in this language are encodings of the synchronizing process. Let $L=\left\{\ell_{1}, \ell_{2}, \cdots, \ell_{n}\right\}$ be the set of locations and $x, y$ two distinct data. Informally, each data word in lang starts with the

- initial block: a delimiter $(\star, y)$, the sequence $\left(\ell_{1}, x\right),\left(\ell_{2}, x\right), \cdots,\left(\ell_{n}, x\right)$ and an input $(a, d) \in \Sigma \times D$ as the beginning of a synchronizing word. The initial block is followed by several
- normal blocks: the delimiter $(\star, y)$, the set of successor configurations reached from the configurations and the input of the previous block, and the next input $\left(a^{\prime}, d^{\prime}\right)$ of the synchronizing data word. The data word finally ends with the
- final block: the delimiter $(\star, y)$, a single successor configuration reached from the configurations and the input of the previous block, and the delimiter $(\star, y)$.

Formally, the language lang is defined over the alphabet $\Sigma_{\text {lang }}=\Sigma \cup L \cup\{\star\}$ where $\star \notin \Sigma \cup L$. It contains all data words $u$ that satisfy the following membership conditions:

1. The data word $u$ starts with $(\star, y)\left(\ell_{1}, x\right)\left(\ell_{2}, x\right) \cdots\left(\ell_{n}, x\right)$ for some $x, y \in D$ with $y \neq x$; this condition guarantees the correctness of the encoding for the initial block.
2. Let $\operatorname{proj}(u)$ be the projection of $u$ into $\Sigma_{\text {lang }}$ (i.e., omitting the data values). Then there exists some $\ell_{\text {synch }} \in L$ where $\operatorname{proj}(u) \in\left(\star L^{+} \Sigma\right)^{+} \star \ell_{\text {synch }} \star$. This condition guarantees the right form of data words to be encodings of synchronizing processes.

The next two conditions guarantee the uniqueness of the delimiter:

3. The letter $\star$ in $u$ occurs only with datum $y$.
4. No other letter in $u$ occurs with datum $y$.

The next three conditions guarantee that all the successors that can be reached from configurations and inputs in each block are correctly inserted in the next block. For all $(\ell, x) \in L \times D$ and $(a, d) \in \Sigma \times D$ in the same block,

5. if $x=d$ and there exists a transition $\ell \xrightarrow{=r\ a} \ell^{\prime}$ (with or without update), then $\left(\ell^{\prime}, x\right)$ must be in the next block.
6. if $x \neq d$ and there exists a transition $\ell \xrightarrow{\neq r\ a} \ell^{\prime}$, then $\left(\ell^{\prime}, x\right)$ must be in the next block.
7. if $x \neq d$ and there exists a transition $\ell \xrightarrow{\neq r\ a\ r \downarrow} \ell^{\prime}$, then $\left(\ell^{\prime}, d\right)$ must be in the next block.
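
Conditions 5 to 7 describe exactly the successor configurations of a block, so the block encoding of a candidate synchronizing word can be produced mechanically. The Python sketch below does this under assumptions that are ours: a transition is a hypothetical tuple `(src, guard, letter, update, dst)` with `guard` either `"="` or `"!="`, the delimiter $\star$ is written `"*"`, and the caller chooses a delimiter datum `y` that is not used elsewhere.

```python
def successors(transitions, configs, letter, d):
    """Successors of a set of configurations (location, datum) on input (letter, d)."""
    nxt = set()
    for (loc, x) in configs:
        for (src, guard, a, update, dst) in transitions:
            if src != loc or a != letter:
                continue
            if guard == "=" and x == d:
                nxt.add((dst, x))                    # condition 5
            elif guard == "!=" and x != d:
                nxt.add((dst, d if update else x))   # conditions 6 and 7
    return nxt

def encode(transitions, locations, x, y, inputs):
    """Block encoding of the run over `inputs` started from L x {x}.
    Returns None if the final set of configurations is not a singleton."""
    block = [(l, x) for l in locations]              # initial block, in order
    configs = set(block)
    word = []
    for (a, d) in inputs:
        word += [("*", y)] + block + [(a, d)]
        configs = successors(transitions, configs, a, d)
        block = sorted(configs)
    if len(configs) != 1:
        return None
    return word + [("*", y)] + block + [("*", y)]    # final block
```

For a two-location automaton in which every inequality-guarded $a$-transition updates the register and leads to `q`, the input `[("a", 5)]` started from datum 0 yields the word `("*", 9) ("p", 0) ("q", 0) ("a", 5) ("*", 9) ("q", 5) ("*", 9)`, whose projection has the form required by condition 2.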

By construction, the NRA $\mathcal{R}$ has some synchronizing data word if, and only if, lang $\neq \emptyset$. Below, we construct a 1-NRA $\mathcal{R}_{\text{comp}}$ that accepts the complement of lang. Then, the NRA $\mathcal{R}$ has some synchronizing data word if, and only if, there exists some data word that is not in $L\left(\mathcal{R}_{\text{comp}}\right)$.

The 1-NRA $\mathcal{R}_{\text{comp}}$ is the union of several 1-NRAs that are in the families of 1-NRAs $\mathbf{R}_{1}, \mathbf{R}_{2}, \cdots, \mathbf{R}_{7}$, where a 1-NRA is in the family $\mathbf{R}_{i}$ if it accepts data words that violate the $i$-th condition among the membership conditions of lang.

1. Family $\mathbf{R}_{1}$ : we add a 1-NRA that accepts data words not starting with $(\star, y)\left(\ell_{1}, x\right), \cdots,\left(\ell_{n}, x\right)$.

2. Family $\mathbf{R}_{2}$ : we add a DFA that accepts data words $u$ such that $\operatorname{proj}(u)$ is not in the regular language $\left(\star L^{+} \Sigma\right)^{+} \star \ell_{\text {synch }} \star$.
3. Family $\mathbf{R}_{3}$ : we add a 1-NRA that accepts data words in which two delimiters $\star$ have different data.

4. Family $\mathbf{R}_{4}$ : we add a 1-NRA that accepts data words in which the datum of the first $\star$ also occurs with some letter other than $\star$.
5. Family $\mathbf{R}_{5}$ : for all transitions $\ell \xrightarrow{=r a} \ell^{\prime}$, we add a 1-NRA that only accepts data words such that one block contains some $(\ell, x)$ and $(a, d)$ with $x=d$ where the next block does not have $\left(\ell^{\prime}, x\right)$.

6. Family $\mathbf{R}_{6}$ : for all transitions $\ell \xrightarrow{\neq r a} \ell^{\prime}$, we add a 1-NRA that only accepts data words such that one block contains some $(\ell, x)$ and $(a, d)$ with $x \neq d$ where the next block does not have $\left(\ell^{\prime}, x\right)$.

7. Family $\mathbf{R}_{7}$ : for all transitions $\ell \xrightarrow{\neq r a r \downarrow} \ell^{\prime}$, we add a 1-NRA that only accepts data words such that one block contains some $(\ell, x)$ and $(a, d)$ with $x \neq d$ where the next block does not have $\left(\ell^{\prime}, d\right)$.


Figure 6: An RA with synchronizing data word $(a, x)(b, y)(b, z)$ with three distinct data values $x, y, z$. The approach of using a unique data value to shrink the infinite set of configurations to a finite subset only yields synchronizing data words of length greater than 3 .


The proof is complete.

By Lemmas 8 and 10 and Ackermann-completeness of the non-universality problem for 1-NRAs, which follows from Theorem 2.7 and the proof of Theorem 5.2 in [11], and the result for counter automata with incrementing errors in [18], we obtain the following theorem.

Theorem 11. The synchronization problem for 1-NRAs is Ackermann-complete.

## 5 Length-Bounded synchronizing data words for NRAs

As proved in the previous section, the synchronization problem for NRAs is in general undecidable. In this section, we study the length-bounded synchronization problem for NRAs, in which the synchronizing data words are required to be shorter than a given length (written in binary).

To decide the synchronization problem in 1-RAs, both in the deterministic and nondeterministic setting, we rely on Lemma 1. With this lemma at hand, it was sufficient to search for synchronizing data words that first input a single datum $x$ (chosen arbitrarily) as many times as necessary to have the set of successor configurations included in $L \times\{x\}$. In the next step, this obtained set of successor configurations was synchronized into a singleton. However, the shortest synchronizing data words do not always follow this pattern; for an example see Figure 6. Observe that the data word $(a, x)(b, y)(b, z)$ is synchronizing with length 3 (not exceeding the bound 3). However, all synchronizing data words that repeat a datum such as $x$, to first bring the RA to a finite set, have length at least 4. The example shows that one cannot rely on the techniques developed in Section 4 to decide the length-bounded synchronization problem for NRAs.

In this section, we prove the following theorem.

Theorem 12. The length-bounded synchronization problem for NRAs is NEXPTIME-complete.

The NEXPTIME-membership of the length-bounded synchronization problem is straightforward: guess a data word $w$ shorter than the given length (which is written in binary, so $w$ may be exponentially long in the size of the input) and check in EXPTIME whether $w$ is synchronizing. Our main contribution is to prove the NEXPTIME-hardness of this problem, for which in turn, by Lemma 8, it is sufficient to show that the length-bounded universality problem is co-NEXPTIME-complete. The length-bounded universality problem asks, given an RA and $N \in \mathbb{N}$ encoded in binary, whether all data words $w$ with $|w| \leq N$ are in the language of the automaton.

Theorem 13. The length-bounded universality problem for NRAs is co-NEXPTIME-complete.

Proof. The length-bounded universality problem for NRAs can be solved in co-NEXPTIME, by guessing a (possibly exponentially long) data word and checking whether the guessed word is a witness for non-universality of the RA.

We prove that the complement of the length-bounded universality problem is NEXPTIME-hard. The proof is a reduction from the membership problem of $\mathcal{O}\left(2^{n}\right)$-time bounded nondeterministic Turing machines: given a nondeterministic Turing machine $\mathcal{M}$ and an input word $x$, decide whether $\mathcal{M}$ accepts $x$ within time bound $2^{|x|}$. This problem is NEXPTIME-complete.

Given a nondeterministic Turing machine $\mathcal{M}$ and an input $x$ of length $n$, we construct an NRA $\mathcal{R}$ equipped with an initial location and a set of accepting locations, and a bound $N$ (encoded in binary) such that there exists a witness of non-universality $w$ (i.e., $w \notin L(\mathcal{R}))$ with $|w| \leq N$ if, and only if, $\mathcal{M}$ has some accepting computation on $x$ within time bound $2^{n}$.

Let $\mathcal{M}$ have the set $Q$ of control states and the tape alphabet $\Gamma$. Let us recall that a configuration of $\mathcal{M}$ is a word in the language $\Gamma^{*}(Q \times \Gamma) \Gamma^{*}$, where each letter in $(Q \times \Gamma) \cup \Gamma$ encodes a single cell and the position of the reading/writing head. A computation $\rho$ of $\mathcal{M}$ is a sequence $c_{1} c_{2} c_{3} \cdots$ of configurations that respects the transition function of the Turing machine. Without loss of generality, we assume that $\mathcal{M}$ has a self-loop on all accepting states. Hence for the input $x \in \Gamma^{*}$ of length $n$, all accepting computations $\rho$ of $\mathcal{M}$ are sequences of length exactly $2^{n}$, and all configurations $c_{i}$ along such a computation are words $c_{i} \in \Gamma^{*}(Q \times \Gamma) \Gamma^{*}$ of length at most $2^{n}$. In the following, we pad the configurations shorter than $2^{n}$ with $\square$ at the tail such that the length of every such configuration becomes equal to $2^{n}$.

Let $\Sigma_{\mathcal{M}}:=\Sigma \cup \Sigma^{\prime}$, where

$$
\Sigma=(Q \times \Gamma) \cup \Gamma \cup(Q \dot{\times} \Gamma) \cup \dot{\Gamma} \cup\{\square, \dot{\square}, \#, \star\}
$$

be such that $\square, \dot{\square}, \#, \star \notin \Gamma$. Here, $(Q \dot{\times} \Gamma)$ and $\dot{\Gamma}$ denote a dotted version of letters in $Q \times \Gamma$ and $\Gamma$; formally

$$
\{\dot{(q, a)} \mid(q, a) \in(Q \times \Gamma)\} \quad \text { and } \quad\{\dot{a} \mid a \in \Gamma\}
$$

and $\Sigma^{\prime}$ will be defined later. Let $K=2^{3 n}+2^{2 n}+1$. Given a computation $\rho=c_{1} \cdots c_{2^{n}}$, we define $u(\rho) \in \Sigma^{K}$, roughly speaking, such that

1. It consists of $2^{n}$ copies of $\rho$ (with some extra delimiters).
2. Between all consecutive copies of $\rho$ there is a $\star$ delimiter, and $u(\rho)$ starts and ends with $\star$, too. Hence, there are $2^{n}+1$ occurrences of $\star$ in $u(\rho)$.
3. In each copy of $\rho$, there is a \# delimiter between consecutive configurations. Since there are $2^{n}$ configurations in (each copy of) $\rho$, the number of $\#$ in $u(\rho)$ is $2^{n}\left(2^{n}-1\right)$.
4. In the $i$-th copy of $\rho$, the letter for the $i$-th cell of every configuration $c_{j}$ is dotted; all other letters are non-dotted. Hence, in each copy of $\rho$ there are exactly $2^{n}$ dotted letters (one in each configuration of $\rho$), with distance $2^{n}+1$ between consecutive ones.
5. The distance between two consecutive $\star$ delimiters is $2^{2 n}+2^{n}-1$, due to the fact that $\rho$ consists of $2^{n}$ configurations, each of which has $2^{n}$ tape cells and is separated from the next configuration by a \# delimiter.

Figure 7 illustrates an example of $u(\rho)$. Observe that for all $\rho=c_{1} \cdots c_{2^{n}}$, we have

$$
|u(\rho)|=\underbrace{2^{n}}_{\text {number of copies of } \rho} \overbrace{2^{n}}^{\text {number of configurations } c_{i} \text { in } \rho} \underbrace{2^{n}}_{\text {length of } c_{i}}+\overbrace{2^{n}\left(2^{n}-1\right)}^{\text {number of } \#}+\underbrace{\left(2^{n}+1\right)}_{\text {number of } \star}=K .
$$
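The five structural properties of $u(\rho)$ can be checked mechanically on a toy instance. The following is a minimal sketch, assuming a hypothetical helper `build_u` that takes $2^{n}$ configurations of $2^{n}$ cells each and returns the projection as a list of pairs `(symbol, dotted)`; the cell names are placeholders and do not come from an actual Turing machine computation.

```python
def build_u(rho):
    """Assemble u(rho) from a list of m configurations, each a list of m cells.

    Letters of the result are pairs (symbol, dotted). The helper and the
    pair representation are illustrative assumptions, not the paper's notation.
    """
    m = len(rho)
    assert all(len(config) == m for config in rho)
    star, hash_ = ("*", False), ("#", False)
    word = [star]                      # u(rho) starts with a star
    for copy in range(m):              # m = 2^n copies of rho
        for k, config in enumerate(rho):
            for j, cell in enumerate(config):
                # in the i-th copy, the i-th cell of every configuration is dotted
                word.append((cell, j == copy))
            if k < m - 1:
                word.append(hash_)     # '#' between consecutive configurations
        word.append(star)              # a star closes every copy
    return word


n = 2
m = 2 ** n
rho = [[f"c{k}_{j}" for j in range(m)] for k in range(m)]  # placeholder cells
u = build_u(rho)
K = 2 ** (3 * n) + 2 ** (2 * n) + 1
print(len(u) == K)
```

For $n=2$ the sketch yields a word with $5$ stars and $12$ occurrences of \#, matching the counts $2^{n}+1$ and $2^{n}(2^{n}-1)$ stated above.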

Figure 7: Partial encoding of $u(\rho)$ for an accepting $2^{2}$-time bounded computation $\rho$ of a Turing machine on $a_{1} a_{2}$.

We define a data language lang over the alphabet $\Sigma$ such that data words in this language are faithful encodings of computations $\rho$ of $\mathcal{M}$ over the input word $x$. In particular, the language contains all data words $v$ that satisfy the following conditions:

6. Let $\operatorname{proj}(v)$ be the projection of $v$ into $\Sigma$ (i.e., omitting the data values). There exists some accepting computation $\rho$ of $\mathcal{M}$ on the input $x$ such that $\operatorname{proj}(v)=u(\rho)$.
7. The letters $\star$ and \# occur only with a unique datum, say datum 0 (and no other letter occurs with that datum).
8. For all occurrences of $\star$, for all $1 \leq i \leq 2^{2 n}+2^{n}-1$, all letters at the $i$-th positions after each $\star$ must carry the same datum, say datum $i$. Except for occurrences of \#, the datum $i$ is exclusive for the $i$-th positions after occurrences of $\star$.

Given a data word $v \in \operatorname{lang}$ such that $\operatorname{proj}(v)=u(\rho)$ for some computation $\rho$, condition (8) and the previous conditions on $u(\rho)$ entail that for all $1 \leq j, k \leq 2^{n}$ the $j$-th tape cell in the $k$-th configuration $c_{k}$ of all copies of $\rho$ in $v$ carries the same datum (revisit Figure 7). Observe that all data words $v \in \operatorname{lang}$ use exactly $2^{2 n}+1$ distinct data values.
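The count of $2^{2 n}+1$ data values can be reproduced on small instances. The sketch below is an illustration under assumptions: the function `encode_with_data` and the tuple representation of letters are hypothetical, data are modelled as integers, datum $0$ is reserved for the delimiters, and every other letter receives its offset after the most recent $\star$, as in conditions (7) and (8).

```python
def encode_with_data(n):
    """Return the data word shape for parameter n as a list of (letter, datum).

    Cells are represented abstractly as ("cell", k, j), meaning cell j of
    configuration k; the concrete tape content is irrelevant for the data.
    """
    m = 2 ** n
    word = [("*", 0)]
    for _copy in range(m):
        offset = 0                     # position counted from the last star
        for k in range(m):
            for j in range(m):
                offset += 1
                word.append((("cell", k, j), offset))
            if k < m - 1:
                offset += 1            # '#' occupies a position ...
                word.append(("#", 0))  # ... but carries the delimiter datum
        word.append(("*", 0))
    return word


def distinct_data(word):
    """Set of data values occurring in the word."""
    return {datum for _, datum in word}


print(len(distinct_data(encode_with_data(2))))
```

Since the offsets of the $2^{n}-1$ positions held by \# are never used as data, the $2^{2 n}+2^{n}-1$ positions between two stars contribute $2^{2 n}$ data values, plus the single delimiter datum.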

By definition of lang, we see that lang is non-empty if, and only if, there is an accepting computation $\rho$ of $\mathcal{M}$ over $x$. Recall that $\Sigma_{\mathcal{M}}=\Sigma \cup \Sigma^{\prime}$ (where $\Sigma^{\prime}$ is defined later). Below, we construct a 1-NRA $\mathcal{R}$ over alphabet $\Sigma_{\mathcal{M}}$ such that the language accepted by $\mathcal{R}$ (projected into $\Sigma$, ignoring $\Sigma^{\prime}$ letters) is the complement of lang. At the end, we examine the existence of $N \in \mathcal{O}(K)$ such that $\mathcal{M}$ has an accepting computation over $x$ if, and only if, $\mathcal{R}$ is (length-bounded) non-universal with respect to the bound $N$.

The 1-NRA $\mathcal{R}$ is the union of several 1-NRAs and DFAs that we describe in the following. Each of these automata accepts the data words that violate one of the necessary conditions for data words $v$ to be in lang.

- We add a DFA that accepts data words $v$ such that $\operatorname{proj}(v)$ is not in the regular language $(\star L)^{*} \star$ where $L$ is defined by

$$
\left((\Gamma+\dot{\Gamma})^{*}((Q \times \Gamma)+(Q \dot{\times} \Gamma))(\Gamma+\dot{\Gamma})^{*}(\square+\dot{\square})^{*} \#\right)^{*}
$$

- We add a DFA that accepts data words $v$ such that $\operatorname{proj}(v)$ does not start with

$$
\star\left(\dot{\left(q_{\text {init }}, a_{1}\right)} a_{2} a_{3} \ldots a_{n} \square^{*} \#\right),
$$

where $q_{\text {init }}$ is the initial control state of $\mathcal{M}$ and $x=a_{1} a_{2} \cdots a_{n}$ is the input. This regular expression also guarantees that in the first copy of $\rho$, the first cell is dotted.

- We add a DFA that accepts data words $w$ containing at least two dotted letters between two consecutive \#.
- We add a 1-NRA that accepts data words in which some delimiter occurs with some datum different from the datum for the first $\star$.
- We add a 1-NRA that accepts data words in which some other letter appears with the datum dedicated to delimiters $\star$ and $\#$.
- We add a 1-NRA that accepts data words in which there are two letters (other than \#) between two consecutive $\star$ that carry the same datum.

- We add a 1-NRA that accepts data words $v$ such that there are two consecutive \# whose distance is not exactly $2^{n}$ (ignoring the occurrences of $\star$). For this we use a variant of $\mathcal{R}_{\text {counter }(n)}$, the 1-NRA implementing a binary counter introduced in Section 4. For accepting data words $v$ such that the distance between two consecutive $\#$ is less than $2^{n}$, we add a transition

$$
2_{c}^{n} \xrightarrow{\#} \ell_{\mathrm{f}},
$$

and for accepting data words in which the distance is more than $2^{n}$, we add a transition

$$
2^{n} \xrightarrow{\Sigma} \ell_{\mathrm{f}}
$$

Here, $\ell_{\mathrm{f}}$ is an accepting location with a self-loop for every letter in $\Sigma$.

For the next four 1-NRAs we can use simple variants of $\mathcal{R}_{\text {counter }(n)}$:

- We add a 1-NRA that accepts data words $v$ such that between two consecutive $\star$, the letter \# does not occur exactly $2^{n}-1$ times.
- We add a 1-NRA that accepts data words $v$ such that $\star$ does not occur exactly $2^{n}+1$ times.
- We add a 1-NRA that accepts data words $v$ such that the distance between two consecutive dotted letters is not exactly $2^{n}+1$, if no delimiter $\star$ is seen between these two letters. We add another 1-NRA that accepts data words $v$ such that the distance between two consecutive dotted letters is not exactly $2^{n}+2$ if $\star$ is seen.
- We add a 1-NRA that accepts data words $v$ such that the letters with $2^{2 n}+2^{n}-1$ distance carry different data.

To implement the above binary counters with 1-NRAs, we finally define

$$
\Sigma^{\prime}=\left\{\operatorname{Bit}_{i}^{d}, \operatorname{Bit}_{i}^{\#}, \operatorname{Bit}_{i}^{\star}, \operatorname{Bit}_{i}, \operatorname{Bit}_{i}^{x}, \operatorname{Bit}_{i+n}^{x} \mid 0 \leq i \leq n\right\},
$$

where

- letters $\mathrm{Bit}_{0}^{d}, \ldots, \mathrm{Bit}_{n}^{d}$ for counting the distance between two consecutive \#. The counter takes into account only letters in $\Sigma \backslash\{\star\}$, ignoring the occurrences of $\star$ and other $\mathrm{Bit}_{i}$-letters from $\Sigma^{\prime}$. The 1-NRA detects whether the distance is less than or greater than $2^{n}$.
- letters $\mathrm{Bit}_{0}^{\#}, \ldots, \mathrm{Bit}_{n}^{\#}$ for counting the occurrences of $\#$. The 1-NRA detects whether the number of \# between two consecutive $\star$ is less than or greater than $2^{n}-1$.
- letters $\mathrm{Bit}_{0}^{\star}, \ldots, \mathrm{Bit}_{n}^{\star}$ for counting the occurrences of $\star$ (to check against $2^{n}+1$ ).
- letters $\mathrm{Bit}_{0}, \ldots, \mathrm{Bit}_{n}$ for counting the distance between two consecutive dotted letters (to check against $2^{n}+1$ or $2^{n}+2$ ).
- letters $\operatorname{Bit}_{0}^{x}, \ldots, \operatorname{Bit}_{2 n}^{x}$ for counting the distance between two letters that carry the same datum (to check against $2^{2 n}+2^{n}+1$ ).

We construct all these gadgets such that the Bit-letters always carry the same datum as the delimiters.

The union of all above 1-NRAs and DFAs accepts all data words except those $v$ such that $\operatorname{proj}(v)=(\star \rho)^{2^{n}} \star$ (and that in addition respect the uniqueness conditions on data appearing in $v$). Finally, we add NRAs that check whether $\rho=c_{1} \cdots c_{2^{n}}$ in such $v$ is not a faithful computation of $\mathcal{M}$, or is not an accepting computation. To this aim, for all words $\sigma_{1} \sigma_{2} \sigma_{3} \in((Q \times \Gamma) \cup \Gamma)^{3}$ of length three such that $\sigma_{1} \sigma_{2} \sigma_{3}$ can appear at some position $i$ in a valid configuration $c$ of $\mathcal{M}$, we define $\operatorname{Post}\left(\sigma_{1} \sigma_{2} \sigma_{3}\right)$ to be the set of words $u \in((Q \times \Gamma) \cup \Gamma)^{3}$ that can appear in a successor configuration of $c$ in the same position $i$ (according to the rules of $\mathcal{M}$).

- For all words $\dot{\sigma}_{1} \sigma_{2} \sigma_{3} \in\left((Q \dot{\times} \Gamma) \cup \dot{\Gamma}\right)((Q \times \Gamma) \cup \Gamma)^{2}$ that start with a dotted letter, we add a 1-NRA that accepts data words in which, for some occurrence of the subword $\left(\dot{\sigma}_{1}, d_{1}\right)\left(\sigma_{2}, d_{2}\right)\left(\sigma_{3}, d_{3}\right)$ with some data $d_{1}, d_{2}, d_{3}$, the subword $\tau_{1} \dot{\tau}_{2} \tau_{3}$ (ignoring the data values) at distance exactly $2^{2 n}+2^{n+1}+1$ is not in $\operatorname{Post}\left(\sigma_{1} \sigma_{2} \sigma_{3}\right)$. Observe that the subword $\dot{\sigma}_{1} \sigma_{2} \sigma_{3}$ intuitively indicates some part of some configuration $c$ in some copy of $\rho$, and $\tau_{1} \dot{\tau}_{2} \tau_{3}$ at distance $2^{2 n}+2^{n+1}+1$ is a subword of the successor configuration of $c$ in the next copy of $\rho$.
The following NRA is for the case $\left(q_{\text {init }}, a_{1}\right) a_{2}$. To implement this 1-NRA, we rely on the previous conditions that two letters (apart from the delimiters) with the same datum have the exact distance $2^{2 n}+2^{n+1}+1$ (checked with a parallel 1-NRA).
- We add a DFA that accepts data words $v$ such that the last configuration in $\rho$ does not contain a letter in $\left(Q_{\mathrm{f}} \times \Gamma\right) \cup\left(Q_{\mathrm{f}} \dot{\times} \Gamma\right)$, where $Q_{\mathrm{f}}$ is the set of accepting control states of $\mathcal{M}$.

To complete the proof, we examine the existence of $N \in \mathcal{O}(K)$ such that $\mathcal{M}$ has an accepting computation over $x$ if, and only if, $\mathcal{R}$ is (length-bounded) non-universal with respect to the bound $N$. Given the shortest witness $w \in \Sigma_{\mathcal{M}}^{+}$ of non-universality of $\mathcal{R}$, the projection $v$ of $w$ into $\Sigma$ encodes an accepting computation of $\mathcal{M}$ over $x$, and consequently has length exactly $K$. The extra letters of $w$ compared to $v$ serve to implement the five needed counters faithfully. However, these letters do not increase the length of $w$ much beyond $K$: for instance, the condition for counting the occurrences of $\#$ requires that we accompany every $\#$ with a single $\mathrm{Bit}_{i}^{\#}$-letter. Hence, a bound $N \in \mathcal{O}(K)$ suffices.

Note that $N$ is still exponential in $n$.
The construction of $\mathcal{R}$ is complete and the NEXPTIME-hardness follows from the sketched reduction. Note that the result already holds for 1-NRAs.

There is a natural reduction from the non-universality problem for 1-NRAs to the emptiness problem for single-register alternating RAs (1-ARAs). The trivial NEXPTIME membership (guess and check) and Theorem 12 lead to the NEXPTIME-completeness of the length-bounded emptiness problem for 1-ARAs.

Acknowledgements We thank Sylvain Schmitz for helpful discussions on well-structured systems and non-elementary complexity classes. We thank James Worrell for inspiring discussions, especially drawing our attention to a trick that simplified the NEXPTIME-hardness construction. We appreciate the anonymous reviewers for their insightful comments and suggestions.

## References

[1] R. Angles and C. Gutierrez. Survey of graph database models. ACM Comput. Surv., 40(1):1:1-1:39, Feb. 2008.
[2] P. Barceló, L. Libkin, A. W. Lin, and P. T. Wood. Expressive languages for path queries over graph-structured data. ACM Trans. Database Syst., 37(4):31:1-31:46, Dec. 2012.
[3] Y. Benenson, R. Adar, T. Paz-Elizur, Z. Livneh, and E. Shapiro. DNA molecule provides a computing machine with both data and fuel. Proc. National Acad. Sci. USA, 100:2191-2196, 2003.
[4] M. Bojańczyk, A. Muscholl, T. Schwentick, L. Segoufin, and C. David. Two-variable logic on words with data. In 21th IEEE Symposium on Logic in Computer Science (LICS 2006), 12-15 August 2006, Seattle, WA, USA, Proceedings, pages 7-16. IEEE Computer Society, 2006.
[5] M. Bojanczyk and P. Parys. XPath evaluation in linear time. J. ACM, 58(4):17:1-17:33, July 2011.
[6] A. Bouajjani, P. Habermehl, Y. Jurski, and M. Sighireanu. Rewriting systems with data. In E. Csuhaj-Varjú and Z. Ésik, editors, Fundamentals of Computation Theory, 16th International Symposium, FCT 2007, Budapest, Hungary, August 27-30, 2007, Proceedings, volume 4639 of Lecture Notes in Computer Science, pages 1-22. Springer, 2007.
[7] P. Bouyer, A. Petit, and D. Thérien. An algebraic approach to data languages and timed languages. Inf. Comput., 182(2):137-162, 2003.
[8] J. Černý. Poznámka k homogénnym experimentom s konečnými automatmi. Matematicko-fyzikálny časopis, 14(3):208-216, 1964.
[9] D. Chistikov, P. Martyugin, and M. Shirmohammadi. Synchronizing automata over nested words. In Foundations of Software Science and Computation Structures - 19th International Conference, FOSSACS 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2-8, 2016, Proceedings, volume 9634 of Lecture Notes in Computer Science, pages 252-268. Springer, 2016.
[10] L. Clemente and S. Lasota. Timed pushdown automata revisited. In 30th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2015, Kyoto, Japan, July 6-10, 2015, pages 738-749. IEEE, 2015.
[11] S. Demri and R. Lazic. LTL with the freeze quantifier and register automata. ACM Trans. Comput. Log., 10(3), 2009.
[12] S. Demri, R. Lazic, and D. Nowak. On the freeze quantifier in constraint LTL: decidability and complexity. Inf. Comput., 205(1):2-24, 2007.
[13] S. Demri, R. Lazic, and A. Sangnier. Model checking memoryful linear-time logics over one-counter automata. Theor. Comput. Sci., 411(22-24):2298-2316, 2010.
[14] L. Doyen, L. Juhl, K. G. Larsen, N. Markey, and M. Shirmohammadi. Synchronizing words for weighted and timed automata. In V. Raman and S. P. Suresh, editors, 34th International Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS 2014, December 15-17, 2014, New Delhi, India, volume 29 of LIPIcs, pages 121-132. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2014.
[15] L. Doyen, T. Massart, and M. Shirmohammadi. Infinite synchronizing words for probabilistic automata. In Mathematical Foundations of Computer Science 2011-36th International Symposium, MFCS 2011, Warsaw, Poland, August 22-26, 2011. Proceedings, volume 6907 of Lecture Notes in Computer Science, pages 278-289. Springer, 2011.
[16] D. Figueira. Satisfiability of downward XPath with data equality tests. In Proceedings of the Twenty-Eighth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '09, pages 197-206, New York, NY, USA, 2009. ACM.
[17] D. Figueira. Alternating register automata on finite words and trees. Logical Methods in Computer Science, 8(1), 2012.
[18] D. Figueira, S. Figueira, S. Schmitz, and P. Schnoebelen. Ackermannian and primitive-recursive bounds with Dickson's lemma. In Proceedings of the 26th Annual IEEE Symposium on Logic in Computer Science, LICS 2011, June 21-24, 2011, Toronto, Ontario, Canada, pages 269-278. IEEE Computer Society, 2011.
[19] M. Kaminski and N. Francez. Finite-memory automata. Theor. Comput. Sci., 134(2):329-363, 1994.
[20] J. Kretínský, K. G. Larsen, S. Laursen, and J. Srba. Polynomial time decidability of weighted synchronization under partial observability. In 26th International Conference on Concurrency Theory, CONCUR 2015, Madrid, Spain, September 1-4, 2015, volume 42 of LIPIcs, pages 142-154. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2015.
[21] K. G. Larsen, S. Laursen, and J. Srba. Synchronizing strategies under partial observability. In CONCUR 2014 - Concurrency Theory - 25th International Conference, CONCUR 2014, Rome, Italy, September 2-5, 2014. Proceedings, volume 8704 of Lecture Notes in Computer Science, pages 188-202. Springer, 2014.
[22] A. Lisitsa and I. Potapov. Temporal logic with predicate lambda-abstraction. In 12th International Symposium on Temporal Representation and Reasoning (TIME 2005), 23-25 June 2005, Burlington, Vermont, USA, pages 147-155. IEEE Computer Society, 2005.
[23] P. V. Martyugin. Complexity of problems concerning carefully synchronizing words for PFA and directing words for NFA. In Computer Science - Theory and Applications, 5th International Computer Science Symposium in Russia, CSR 2010, Kazan, Russia, June 16-20, 2010. Proceedings, volume 6072 of Lecture Notes in Computer Science, pages 288-302. Springer, 2010.
[24] F. Neven, T. Schwentick, and V. Vianu. Finite state machines for strings over infinite alphabets. ACM Trans. Comput. Log., 5(3):403-435, 2004.
[25] J. Pin. Sur les mots synchronisants dans un automate fini. Elektronische Informationsverarbeitung und Kybernetik, 14(6):297-303, 1978.
[26] H. Sakamoto and D. Ikeda. Intractability of decision problems for finite-memory automata. Theor. Comput. Sci., 231(2):297-308, 2000.
[27] S. Schmitz. Complexity hierarchies beyond elementary. ACM Trans. Comput. Theory, 8(1):3:1-3:36, 2016.
[28] M. Shirmohammadi. Phd thesis: Qualitative analysis of probabilistic synchronizing systems. 2014.
[29] N. Tzevelekos. Fresh-register automata. In T. Ball and M. Sagiv, editors, Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 295-306. ACM, 2011.
[30] M. V. Volkov. Synchronizing automata and the Černý conjecture. In C. Martín-Vide, F. Otto, and H. Fernau, editors, Language and Automata Theory and Applications, Second International Conference, LATA 2008, Tarragona, Spain, March 13-19, 2008. Revised Papers, volume 5196 of Lecture Notes in Computer Science, pages 11-27. Springer, 2008.

## Appendix

## 6 Proofs for Deterministic Register Automata

Lemma 2. For all DRAs for which there exist synchronizing data words, there exists a synchronizing data word $w$ with data efficiency at most $2|R|+1$.

Proof. Let $\mathcal{R}=\langle L, R, \Sigma, T\rangle$ be a DRA on the data domain $D$ and with $k \geq 1$ registers. Recall that we denote by $\operatorname{data}(w)$ the data occurring in data words $w$; for configurations $q=(\ell, \nu)$ we use the same notation $\operatorname{data}(q)=\{\nu(r) \mid r \in R\}$ to denote the data appearing in the valuation of $q$. Let $\pi: Y_{1} \rightarrow Y_{2}$ be a bijection on data where $Y_{1}, Y_{2} \subseteq D$. For every configuration $q=(\ell, \nu)$, define $\pi(q)=\left(\ell, \nu^{\prime}\right)$, where $\nu^{\prime}$ satisfies $\nu^{\prime}(r)=\pi(\nu(r))$ for all $r \in R$. For every data word $w=\left(a_{1}, d_{1}\right) \ldots\left(a_{n}, d_{n}\right)$, define $\pi(w)=\left(a_{1}, \pi\left(d_{1}\right)\right) \ldots\left(a_{n}, \pi\left(d_{n}\right)\right)$. Note that the application of $\pi$ on $q$ and $w$ preserves the reachability property, i.e., $\operatorname{post}(\pi(q), \pi(w))=\left\{\pi\left(q^{\prime}\right) \mid q^{\prime} \in \operatorname{post}(q, w)\right\}$.
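The invariance of reachability under renaming can be observed on a toy automaton. The sketch below assumes a simplified single-register deterministic model that is not taken from the paper: a transition is selected by the current location, the input letter, and whether the input datum equals the register content, and it either keeps the register or overwrites it with the input datum. The names `post`, `rename_config`, and `rename_word` are hypothetical.

```python
def post(trans, config, word):
    """Run a toy 1-register deterministic automaton from config on word.

    trans maps (location, letter, test) to (update, target), where test is
    "eq" if the input datum equals the register content and "neq" otherwise.
    """
    loc, reg = config
    for letter, datum in word:
        test = "eq" if datum == reg else "neq"
        update, loc = trans[(loc, letter, test)]
        if update:
            reg = datum                # store the input datum in the register
    return (loc, reg)


def rename_config(pi, config):
    """Apply the data bijection pi (a dict) to a configuration."""
    loc, reg = config
    return (loc, pi[reg])


def rename_word(pi, word):
    """Apply the data bijection pi (a dict) to every datum of a data word."""
    return [(letter, pi[datum]) for letter, datum in word]


# A small hand-made transition table (purely illustrative).
trans = {
    ("p", "a", "eq"): (False, "p"),
    ("p", "a", "neq"): (True, "q"),
    ("q", "a", "eq"): (False, "p"),
    ("q", "a", "neq"): (False, "q"),
}
q0 = ("p", 1)
w = [("a", 2), ("a", 2), ("a", 3)]
pi = {1: 10, 2: 20, 3: 30}
print(post(trans, rename_config(pi, q0), rename_word(pi, w))
      == rename_config(pi, post(trans, q0, w)))
```

Because transitions only test equality between the input datum and the register, an injective renaming of data never changes which transition fires, which is the intuition behind the reachability property stated above.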

Assuming that $\mathcal{R}$ has some synchronizing data word, we first prove the following claim by induction.
Claim. For all pairs of configurations $q_{1}, q_{2}$, if there exists $w$ such that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w\right)\right|=1$, then

- for all sets $X=\left\{x_{1}, x_{2}, \cdots, x_{2 k+1}\right\} \subset D$ with $\operatorname{data}\left(q_{1}\right), \operatorname{data}\left(q_{2}\right) \subseteq X$,
- there exists some data word $w_{q_{1}, q_{2}} \in(\Sigma \times X)^{*}$ such that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w_{q_{1}, q_{2}}\right)\right|=1$.

Note that by $|X|=2 k+1$, the data efficiency of $w_{q_{1}, q_{2}}$ is at most $2 k+1$.
Proof of Claim. Let $q_{1}$ and $q_{2}$ be two configurations of $\mathcal{R}$ and define $\operatorname{data}\left(q_{1}, q_{2}\right)=\operatorname{data}\left(q_{1}\right) \cup \operatorname{data}\left(q_{2}\right)$. Since $\mathcal{R}$ has some synchronizing data word, there exists $w$ such that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w\right)\right|=1$. The proof is by induction on the length of $w$.
Base of induction. Assume that $w=(a, d)$ has length $|w|=1$. Let $X$ be an arbitrary set of data such that $|X|=2 k+1$ and $\operatorname{data}\left(q_{1}, q_{2}\right) \subseteq X$. There are two cases:

- $d \in X$ : This entails that $\operatorname{data}(w) \subseteq X$. Observe that $w_{q_{1}, q_{2}}=w$ satisfies the induction statement.
- $d \notin X$ : Since $\left|\operatorname{data}\left(q_{1}, q_{2}\right)\right| \leq 2 k$, there exists a datum $x \neq d$ such that $x \in X \backslash \operatorname{data}\left(q_{1}, q_{2}\right)$. Since $x \neq d$, we can define the bijection $\pi:\{d\} \cup \operatorname{data}\left(q_{1}, q_{2}\right) \rightarrow\{x\} \cup \operatorname{data}\left(q_{1}, q_{2}\right)$ such that $\pi(d)=x$ and $\pi\left(d^{\prime}\right)=d^{\prime}$ for all $d^{\prime} \in \operatorname{data}\left(q_{1}, q_{2}\right)$. Observe that $\pi\left(q_{i}\right)=q_{i}$ for all $i \in\{1,2\}$. Then

$$
\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, d)\right)\right|=\left|\operatorname{post}\left(\left\{\pi\left(q_{1}\right), \pi\left(q_{2}\right)\right\},(a, \pi(d))\right)\right|=\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, x)\right)\right| .
$$

This and the assumption $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, d)\right)\right|=1$ yield $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, x)\right)\right|=1$. The word $w_{q_{1}, q_{2}}=(a, x)$ satisfies the induction statement.

The base of induction hence holds.
Step of induction. Assume that the induction hypothesis holds for $i-1$. Consider some word $(a, d) \cdot w$ such that $|w|=i-1$ and $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, d) \cdot w\right)\right|=1$.

Consider some set $X$ of cardinality $2 k+1$ with $\operatorname{data}\left(q_{1}, q_{2}\right) \subseteq X$; we construct the data word $w_{q_{1}, q_{2}}$ as follows. Let $p_{1}=\operatorname{post}\left(q_{1},(a, d)\right)$ and $p_{2}=\operatorname{post}\left(q_{2},(a, d)\right)$, and let $\operatorname{data}\left(p_{1}, p_{2}\right)=\operatorname{data}\left(p_{1}\right) \cup \operatorname{data}\left(p_{2}\right)$. Due to the fact that $p_{1}, p_{2}$ are successors of $q_{1}, q_{2}$ after inputting $(a, d)$, we know that if $d \in \operatorname{data}\left(q_{1}, q_{2}\right)$ then $d \in \operatorname{data}\left(p_{1}, p_{2}\right)$. There are two cases:

- $d \in \operatorname{data}\left(q_{1}, q_{2}\right)$ or $d \notin \operatorname{data}\left(p_{1}, p_{2}\right)$. These guarantee that $\operatorname{data}\left(p_{1}, p_{2}\right) \subseteq \operatorname{data}\left(q_{1}, q_{2}\right)$ if $d \in \operatorname{data}\left(q_{1}, q_{2}\right)$, and that $\operatorname{data}\left(p_{1}, p_{2}\right)=\operatorname{data}\left(q_{1}, q_{2}\right)$ if $d \notin \operatorname{data}\left(p_{1}, p_{2}\right)$. As a result, $\operatorname{data}\left(p_{1}, p_{2}\right) \subseteq X$. By induction hypothesis, there exists some data word $w_{p_{1}, p_{2}}$ over data domain $X$ such that $\left|\operatorname{post}\left(\left\{p_{1}, p_{2}\right\}, w_{p_{1}, p_{2}}\right)\right|=1$. For $w_{q_{1}, q_{2}}=(a, d) \cdot w_{p_{1}, p_{2}}$ the statement of induction holds, as $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w_{q_{1}, q_{2}}\right)\right|=1$.
- $d \notin \operatorname{data}\left(q_{1}, q_{2}\right)$ and $d \in \operatorname{data}\left(p_{1}, p_{2}\right)$. Without loss of generality, we assume that $d \notin X$. Otherwise $d \in X$ would imply $\operatorname{data}\left(p_{1}, p_{2}\right) \subseteq X$, and we simply let $w_{q_{1}, q_{2}}=(a, d) \cdot w_{p_{1}, p_{2}}$. Since $\left|\operatorname{data}\left(q_{1}, q_{2}\right)\right| \leq 2 k$, there exists some datum $x \neq d$ such that $x \in X \backslash \operatorname{data}\left(q_{1}, q_{2}\right)$. Since $x \neq d$, we can define the bijection $\pi:\{d\} \cup \operatorname{data}\left(q_{1}, q_{2}\right) \rightarrow\{x\} \cup \operatorname{data}\left(q_{1}, q_{2}\right)$ such that $\pi(d)=x$ and $\pi\left(d^{\prime}\right)=d^{\prime}$ for all $d^{\prime} \in \operatorname{data}\left(q_{1}, q_{2}\right)$. Since $\operatorname{data}\left(p_{1}, p_{2}\right) \backslash\{d\} \subseteq \operatorname{data}\left(q_{1}, q_{2}\right)$ and $d$ is in the domain of $\pi$, the bijection $\pi$ is defined on all of $\operatorname{data}\left(p_{1}, p_{2}\right)$. By induction hypothesis, there exists some data word $w_{p_{1}, p_{2}}$ over data domain $(X \backslash\{x\}) \cup\{d\}$ such that $\left|\operatorname{post}\left(\left\{p_{1}, p_{2}\right\}, w_{p_{1}, p_{2}}\right)\right|=1$. Then, $\left|\operatorname{post}\left(\left\{\pi\left(p_{1}\right), \pi\left(p_{2}\right)\right\}, \pi\left(w_{p_{1}, p_{2}}\right)\right)\right|=1$. For all $1 \leq i \leq 2$, we have $\pi\left(p_{i}\right) \in \operatorname{post}\left(q_{i},(a, x)\right)$ since $p_{i} \in \operatorname{post}\left(q_{i},(a, d)\right)$ and $x=\pi(d)$. By the above arguments, we conclude that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\},(a, x) \pi\left(w_{p_{1}, p_{2}}\right)\right)\right|=1$. As $\{x\} \cup \operatorname{data}\left(q_{1}, q_{2}\right) \subseteq X$, the data word $w_{q_{1}, q_{2}}=(a, x) \pi\left(w_{p_{1}, p_{2}}\right)$ satisfies the statement of induction.

The above arguments prove that in all cases, there exists $w_{q_{1}, q_{2}} \in(\Sigma \times X)^{*}$ that merges two configurations $q_{1}$ and $q_{2}$ into a singleton, which completes the proof of Claim.

Since $\mathcal{R}$ has some synchronizing data word, using Lemma 1 we know that there exists some word $w$ with data efficiency $k$ such that $\operatorname{post}\left(L \times D^{k}, w\right) \subseteq L \times \operatorname{data}(w)^{k}$. Consider some set $X=\left\{x_{1}, x_{2}, \cdots, x_{2 k+1}\right\} \subset D$ such that $\operatorname{data}(w) \subseteq X$. We use the pairwise synchronization technique as follows. Define $S_{n}=L \times X^{k}$ and $n=|L|(2 k+1)^{k}$, i.e., $\left|S_{n}\right|=n$. For all $i=n-1, \cdots, 1$ repeat the following:

1. Take a pair of configurations $q_{1}, q_{2} \in S_{i+1}$. By the Claim above, one can find some word $w_{q_{1}, q_{2}} \in(\Sigma \times X)^{*}$ such that $\left|\operatorname{post}\left(\left\{q_{1}, q_{2}\right\}, w_{q_{1}, q_{2}}\right)\right|=1$,
2. Define $v_{i}=w_{q_{1}, q_{2}}$ and $S_{i}=\operatorname{post}\left(S_{i+1}, v_{i}\right)$.

Note that by determinism of $\mathcal{R}$, for every $i \in\{1, \cdots, n-1\}$, we have $\left|S_{i}\right| \leq\left|S_{i+1}\right|-1$. Thus the word $w_{\text {synch }}=w \cdot v_{n-1} \cdots v_{2} \cdot v_{1}$ is a synchronizing data word for $\mathcal{R}$. Since $\operatorname{data}(w) \subseteq X$ and $\operatorname{data}\left(v_{i}\right) \subseteq X$ for all $i \in\{1, \cdots, n-1\}$, the data efficiency of $w_{\text {synch }}$ is at most $2 k+1$. The proof is complete.
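The pairwise synchronization technique has a well-known finite-state counterpart, which may help to visualize the loop above. The following sketch works on a plain complete DFA without registers or data, which is a simplifying assumption made for illustration only; the function names are hypothetical. A merging word for a pair of states is found by breadth-first search in the pair automaton, and merging words are concatenated until a single state remains.

```python
from collections import deque


def merge_word(delta, alphabet, p, q):
    """Shortest word sending states p and q to one state, or None."""
    start = (p, q)
    parent = {start: None}             # pair -> (previous pair, letter)
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        if pair[0] == pair[1]:
            word = []
            while parent[pair] is not None:
                pair, letter = parent[pair]
                word.append(letter)
            return word[::-1]
        for letter in alphabet:
            nxt = (delta[pair[0]][letter], delta[pair[1]][letter])
            if nxt not in parent:
                parent[nxt] = (pair, letter)
                queue.append(nxt)
    return None


def run(delta, states, word):
    """Image of a set of states under a word."""
    for letter in word:
        states = {delta[s][letter] for s in states}
    return states


def synchronizing_word(delta, alphabet):
    """Concatenate pairwise merging words; None if some pair cannot merge."""
    current = set(delta)
    word = []
    while len(current) > 1:
        p, q = sorted(current)[:2]
        v = merge_word(delta, alphabet, p, q)
        if v is None:
            return None
        word += v
        current = run(delta, current, v)   # determinism: the set shrinks
    return word


# Four-state automaton of Cerny type: 'a' is a cycle, 'b' moves state 3 to 0.
cerny = {i: {"a": (i + 1) % 4, "b": 0 if i == 3 else i} for i in range(4)}
sync = synchronizing_word(cerny, "ab")
print(len(run(cerny, set(cerny), sync)) == 1)
```

As in the proof, each round strictly decreases the number of remaining states, so at most $n-1$ rounds are needed for $n$ states. The resulting word is synchronizing but not necessarily shortest.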

Lemma 5. The synchronization problem for $k$-DRAs is PSPACE-complete.
Proof. We prove PSPACE-hardness by a reduction from the non-emptiness problem for $k$-DRA. Let $\mathcal{R}=(L, R, \Sigma, T)$ be a $k$-DRA equipped with an initial location $\ell_{i}$ and an accepting location $\ell_{f}$, where, without loss of generality, we assume that all outgoing transitions from $\ell_{i}$ update all registers, and that $\ell_{f}$ has no outgoing edges. We also assume that $\mathcal{R}$ is complete, otherwise, we add some non-accepting location and direct all undefined transitions to it.

The reduction is such that from $\mathcal{R}$ we construct another $k$-DRA $\mathcal{R}_{\text {syn }}$ such that the language of $\mathcal{R}$ is not empty if, and only if, $\mathcal{R}_{\text {syn }}$ has some synchronizing data word. We define $\mathcal{R}_{\text {syn }}=\left(L_{\text {syn }}, R, \Sigma_{\text {syn }}, T_{\text {syn }}\right)$ as follows. The set of locations is $L_{\text {syn }}=L \cup\{\text{reset}\}$, where $\text{reset} \notin L$ is a new location; the alphabet is $\Sigma_{\text {syn }}=\Sigma \cup\{\star\}$, where $\star \notin \Sigma$. To define $T_{\text {syn }}$, we add the following transitions to $T$.

- $\ell_{f} \xrightarrow{a R \downarrow} \ell_{f}$ for all letters $a \in \Sigma_{\text {syn }}$,
- $\ell_{i} \xrightarrow{\star R \downarrow} \ell_{i}$
- reset $\xrightarrow{a R \downarrow} \ell_{i}$ for all letters $a \in \Sigma_{\text {syn }}$,
- $\ell \xrightarrow{\star R \downarrow}$ reset for all $\ell \in L_{\text {syn }}$ except for reset, $\ell_{i}, \ell_{f}$.

Note that $\mathcal{R}_{\text {syn }}$ is indeed deterministic and complete. To establish the correctness of the reduction, we prove that the language of $\mathcal{R}$ is not empty if, and only if, $\mathcal{R}_{\text {syn }}$ has a synchronizing data word.

First, assume that the language of $\mathcal{R}$ is not empty. Then there exists a data word $w=\left(a_{1}, d_{1}\right) \ldots\left(a_{n}, d_{n}\right)$ such that $w \in L(\mathcal{R})$. Hence there exists a run starting from $\left(\ell_{i}, \nu_{i}\right)$ and ending in $\left(\ell_{f}, \nu_{f}\right)$ for some $\nu_{i}, \nu_{f} \in D^{|R|}$. The data word $(\star, d)(\star, d) w(\star, d)$ for some $d \in D$ synchronizes $\mathcal{R}_{\text {syn }}$ in location $\ell_{f}$.

Second, assume that $\mathcal{R}_{\text{syn}}$ has some synchronizing data word. Let $w \in\left(\Sigma_{\text{syn}} \times D\right)^{*}$ be one of the shortest synchronizing data words. All transitions in $\ell_{f}$ are self-loops with update on all registers; hence, $\mathcal{R}_{\text{syn}}$ can only be synchronized in $\ell_{f}$. In particular, we have $\operatorname{post}\left(\left(\ell_{i}, \nu_{i}\right), w\right)=\left\{\left(\ell_{f}, \nu_{f}\right)\right\}$ for some $\nu_{i}, \nu_{f} \in D^{|R|}$. By the fact that $w$ is a shortest synchronizing data word, we can infer that the corresponding run does not contain any $\star$-transitions except for two self-loops in $\ell_{i}$ in the very beginning. Hence there exists a run of $\mathcal{R}$ from $\left(\ell_{i}, \nu_{i}\right)$ to $\ell_{f}$, and thus $L(\mathcal{R}) \neq \emptyset$.

## 7 Proofs for Non-deterministic Register Automata

Lemma 6. There is a family of 1-NRAs $\left(\mathcal{R}_{\text{counter}(n)}\right)_{n \in \mathbb{N}}$ with $\mathcal{O}(n)$ locations, such that for all synchronizing data words $w$, some datum $d \in \operatorname{data}(w)$ appears in $w$ at least $2^{n}$ times.

Proof. The family of 1-NRAs $\left(\mathcal{R}_{\text{counter}(n)}\right)_{n \in \mathbb{N}}$ is defined as follows. The alphabet of $\mathcal{R}_{\text{counter}(n)}$ is $\Sigma=\left\{\#, \star, \mathrm{Bit}_{0}, \mathrm{Bit}_{1}, \cdots, \mathrm{Bit}_{n}\right\}$. The structure of $\mathcal{R}_{\text{counter}(n)}$ is composed of three distinguished locations synch, reset, zero and locations $2^{n}, 2^{n-1}, \cdots, 2^{1}, 2^{0}$ and $2_{c}^{n}, 2_{c}^{n-1}, \cdots, 2_{c}^{1}, 2_{c}^{0}$. The general structure of $\mathcal{R}_{\text{counter}(n)}$ is partially depicted in Figure 4. The RA $\mathcal{R}_{\text{counter}(n)}$ is constructed such that for all synchronizing data words $w$, some datum $x \in \operatorname{data}(w)$ appears in $w$ at least $2^{n}$ times. A counting feature is thus embedded in $\mathcal{R}_{\text{counter}(n)}$: intuitively, the set of all reached configurations represents the counter value. Starting from $\{(\text{zero}, x)\}$, the first increment results in $\left\{2_{c}^{n}, \cdots, 2_{c}^{2}, 2_{c}^{1}, 2^{0}\right\} \times\{x\}$, where location $2^{i}$ means that the $i$-th least significant bit in the binary representation of the counter value is set to 1, and location $2_{c}^{i}$ means that the $i$-th bit is set to 0. Informally, we say that there is an $x$-token in every reached location. Here, $2_{c}^{n}, \cdots, 2_{c}^{2}, 2_{c}^{1}, 2^{0}$ have $x$-tokens. A sequence of counter increments is encoded by re-placing the $x$-tokens, as shown in the following sequence of sets of locations: $\left\{2_{c}^{n}, \cdots, 2_{c}^{2}, 2^{1}, 2_{c}^{0}\right\}$, $\left\{2_{c}^{n}, \cdots, 2_{c}^{2}, 2^{1}, 2^{0}\right\}$, $\left\{2_{c}^{n}, \cdots, 2_{c}^{3}, 2^{2}, 2_{c}^{1}, 2_{c}^{0}\right\}$, etc. The transitions of $\mathcal{R}_{\text{counter}(n)}$ are defined in such a way that, starting from $\{(\text{zero}, x)\}$, either $2^{i}$ or $2_{c}^{i}$ has a token, but never both of them at the same time.
We now present a detailed explanation of the structure of $\mathcal{R}_{\text {counter }(n)}$.

All transitions in synch are self-loops with an update on the register: synch $\xrightarrow{\Sigma r \downarrow}$ synch. Thus, $\mathcal{R}_{\text{counter}(n)}$ can only be synchronized in synch. Moreover, synch is only accessible by \#-transitions. Similarly, all transitions in location reset, except for those with label $\star$, are self-loops; thus, $\mathcal{R}_{\text{counter}(n)}$ can only be synchronized by leaving reset by reading $\star$. We also use reset to rule out transitions which are incorrect with respect to the binary incrementing process: all incorrect actions are guided to reset to enforce another $\star$. Assuming $w$ to be one of the shortest synchronizing data words, we see that $\operatorname{post}(L \times D, w)=\{(\text{synch}, x)\}$, where $w$ starts with $(\star, x)$ and ends with $(\#, x)$.

The counting involves an initializing process and several incrementing processes.

- initializing the counter to zero: the $\star$-transitions are devised to place a token in zero: from all locations $\ell \in L \backslash\{$ synch $\}$ we have $\ell \xrightarrow{\star r \downarrow}$ zero. This sets the counter to 0 .
- incrementing the counter: we use $\mathrm{Bit}_{0}, \ldots, \mathrm{Bit}_{n}$-transitions with equality guards to control the increment. Intuitively, an equality-guarded $\mathrm{Bit}_{i}$-transition is taken to set the $i$-th bit in the binary representation of the counter value according to the standard rules of binary incrementation.
Initially, the token in zero splits in $2^{0}$ and $2_{c}^{n}, \cdots, 2_{c}^{1}$ to represent $0 \cdots 01$, by taking the transitions zero $\xrightarrow{=r\, \mathrm{Bit}_{0}} 2^{0}$ and zero $\xrightarrow{=r\, \mathrm{Bit}_{0}} 2_{c}^{j}$ for all $1 \leq j \leq n$. Equality-guarded $\mathrm{Bit}_{i}$-transitions for $i \in\{1, \ldots, n\}$ are incorrect for zero and thus guided to reset. Whenever data different from $x$ is processed, $\mathcal{R}_{\text{counter}(n)}$ takes self-loops (omitted in Figure 4) and keeps the $x$-tokens unmoved.
The equality-guarded $\mathrm{Bit}_{i}$-transitions should only be taken if the $i$-th bit is not set, or, equivalently, if the location $2^{i}$ contains no token. This is guaranteed by a $\mathrm{Bit}_{i}$-transition $2^{i} \xrightarrow{=r \mathrm{Bit}_{i}}$ reset, for every $0 \leq i \leq n$, which results in an incorrect transition and should be avoided. (Otherwise the counting process has to restart from 0.) In Figure 4, we depict the corresponding transitions for $i=2$ and $i=n$.
Further, we need to guarantee that for all $i \geq 1$ a $\mathrm{Bit}_{i}$-transition is taken only if all less significant bits are set, or, equivalently, if all locations $2^{i-1}, \cdots, 2^{0}$ contain a token. This is ensured by a $\mathrm{Bit}_{i}$-transition $2_{c}^{j} \xrightarrow{=r\, \mathrm{Bit}_{i}}$ reset, for every $0 \leq j<i$, which again results in an incorrect transition. See, e.g., the transitions $2_{c}^{2} \xrightarrow{=r\, \mathrm{Bit}_{i}}$ reset for every $3 \leq i \leq n$ in Figure 4.

Finally, $\mathrm{Bit}_{i}$-transitions must produce tokens in $2^{i}$ and $2_{c}^{0}, \cdots, 2_{c}^{i-1}$, thus $2_{c}^{i} \xrightarrow{=r\, \mathrm{Bit}_{i}} 2^{i}$ and $2^{j} \xrightarrow{=r\, \mathrm{Bit}_{i}} 2_{c}^{j}$ for all $0 \leq j<i$. All tokens in locations $2^{j}$ and $2_{c}^{j}$, respectively, for $j>i$ remain where they are, which is implemented by equality-guarded $\mathrm{Bit}_{i}$-self-loops in $2^{j}$ and $2_{c}^{j}$, respectively.

By construction, it is easy to see that $\mathrm{Bit}_{i}$-transitions are the only way to produce a token in $2^{i}$, and that they can be fired only if $2_{c}^{i}$ has a token. The $\mathrm{Bit}_{i}$-transitions then consume the token in $2_{c}^{i}$. This guarantees that after the first $\star$-transition, which puts a token into zero, the two locations $2^{i}$ and $2_{c}^{i}$ will never have a token at the same time.

Finally, all equality-guarded \#-transitions in $2_{c}^{n}$ and $2^{i}$ for all $0 \leq i<n$ are sent to reset. In contrast, all \#-transitions in $2^{n}$ and $2_{c}^{i}$ for all $0 \leq i<n$ are sent to synch, with an update on the register. This guarantees that the counter must correctly count from 0 to $10 \cdots 0$, meaning that at least one datum $x$ appears at least $2^{n}$ times while synchronizing $\mathcal{R}_{\text {counter }(n)}$.
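The token game described above can be replayed mechanically. The following is a minimal sketch that only tracks the equality-guarded transitions on the single datum $x$. The encoding is hypothetical: `("set", j)` stands for location $2^{j}$, `("clr", j)` for $2_{c}^{j}$, the integer `i` for the letter $\mathrm{Bit}_{i}$, and the \#-transition from zero is assumed to lead to reset, which the text does not specify.

```python
RESET, SYNCH, ZERO = "reset", "synch", "zero"

def all_locations(n):
    bits = [(kind, j) for kind in ("set", "clr") for j in range(n + 1)]
    return {RESET, SYNCH, ZERO, *bits}

def successors(loc, letter, n):
    """Targets of the equality-guarded letter-transitions leaving loc."""
    if loc == SYNCH:
        return {SYNCH}                      # synch is a sink
    if letter == "star":
        return {ZERO}                       # star initializes the counter
    if loc == RESET:
        return {RESET}                      # reset is left only by star
    if letter == "#":
        if loc == ZERO:
            return {RESET}                  # assumption, not stated in the text
        kind, j = loc
        ok = (kind == "set" and j == n) or (kind == "clr" and j < n)
        return {SYNCH} if ok else {RESET}
    i = letter                              # the letter Bit_i
    if loc == ZERO:
        if i != 0:
            return {RESET}
        return {("set", 0)} | {("clr", j) for j in range(1, n + 1)}
    kind, j = loc
    if j > i:
        return {loc}                        # more significant bits stay put
    if j == i:
        return {("set", i)} if kind == "clr" else {RESET}
    return {("clr", j)} if kind == "set" else {RESET}

def post(tokens, letter, n):
    return set().union(*(successors(loc, letter, n) for loc in tokens))

def count_uses_of_x(n):
    """Occurrences of x along the error-free run: star, increments, then #."""
    tokens = post(all_locations(n), "star", n)
    uses = 1
    while post(tokens, "#", n) != {SYNCH}:
        # Exactly one Bit letter avoids reset: the lowest bit that is not set.
        valid = [i for i in range(n + 1) if RESET not in post(tokens, i, n)]
        assert len(valid) == 1
        tokens = post(tokens, valid[0], n)
        uses += 1
    return uses + 1
```

Under these assumptions, `count_uses_of_x(n)` evaluates to $2^{n}+2$: one $\star$, $2^{n}$ increments, and one final \#, in line with the lower bound of $2^{n}$ stated in the lemma.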

Lemma 8. The non-universality problem is reducible to the synchronization problem for NRAs.
Proof. The reduction is based on the construction presented in Theorem 17 in [14].
Let $\mathcal{R}=\langle L, R, \Sigma, T\rangle$ be an NRA equipped with an initial location $\ell_{\text {in }}$ and a set $L_{\mathrm{f}}$ of accepting locations, where, without loss of generality, we assume that all outgoing transitions from $\ell_{\text {in }}$ update all registers. We also assume that $\mathcal{R}$ is complete, otherwise, we add some non-accepting location and direct all undefined transitions to it.

We construct an NRA $\mathcal{R}_{\text{syn}}$ such that there exists some data word that is not in $L(\mathcal{R})$ if, and only if, $\mathcal{R}_{\text{syn}}$ has some synchronizing data word. We define $\mathcal{R}_{\text{syn}}=\left\langle L_{\text{syn}}, R, \Sigma_{\text{syn}}, T_{\text{syn}}\right\rangle$ as follows. The set of locations is $L_{\text{syn}}=L \cup\{\text{reset}, \text{synch}\}$, where synch, reset $\notin L$ are two new locations. The alphabet is $\Sigma_{\text{syn}}=\Sigma \cup\{\#, \star\}$, where $\#, \star \notin \Sigma$. The transition relation $T_{\text{syn}}$ is the union of $T$ and the set containing the following transitions:

- synch $\xrightarrow{a R \downarrow}$ synch for all letters $a \in \Sigma_{\text {syn }}$,
- reset $\xrightarrow{\star R \downarrow} \ell_{\text {in }}$ and reset $\xrightarrow{a R \downarrow}$ reset for all letters $a \in \Sigma_{\text {syn }} \backslash\{\star\}$,
- $\ell \xrightarrow{\star R \downarrow} \ell_{\text {in }}$ for all locations $\ell \in L$,
- $\ell \xrightarrow{\# R \downarrow}$ synch for all non-accepting locations $\ell \in L \backslash L_{\mathrm{f}}$,
- $\ell \xrightarrow{\# R \downarrow}$ reset for all accepting locations $\ell \in L_{\mathrm{f}}$.

Next, we prove the correctness of the reduction.
First, assume there exists a data word $w=\left(a_{1}, d_{1}\right) \ldots\left(a_{n}, d_{n}\right)$ such that $w \notin L(\mathcal{R})$. Hence, all runs starting in $\left(\ell_{\mathrm{in}}, \nu_{i}\right)$ with $\nu_{i} \in D^{|R|}$ end in some configuration $(\ell, \nu)$ with $\ell \notin L_{\mathrm{f}}$. The data word $(\star, d) \cdot w \cdot(\#, d)$ with $d \in D$ synchronizes $\mathcal{R}_{\text {syn }}$ in location synch, proving that $\mathcal{R}_{\text {syn }}$ has some synchronizing data word.
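As a sanity check, the construction can be exercised on the register-free special case, where an NRA is a complete NFA and data values are irrelevant. This is a minimal sketch under that assumption; `build_syn_nfa`, `post`, and the three-location example are hypothetical names and data introduced only for illustration.

```python
STAR, HASH = "*", "#"            # stand for the fresh letters ⋆ and #
RESET, SYNCH = "reset", "synch"  # the two fresh locations

def build_syn_nfa(locations, alphabet, delta, l_in, accepting):
    """Add the transitions listed above to a complete NFA.

    delta maps (location, letter) to a set of successor locations.
    """
    syn_alphabet = set(alphabet) | {STAR, HASH}
    syn = {key: set(targets) for key, targets in delta.items()}
    for a in syn_alphabet:
        syn[(SYNCH, a)] = {SYNCH}                         # synch is a sink
        syn[(RESET, a)] = {l_in} if a == STAR else {RESET}
    for loc in locations:
        syn[(loc, STAR)] = {l_in}                         # star resets to l_in
        syn[(loc, HASH)] = {RESET} if loc in accepting else {SYNCH}
    return set(locations) | {RESET, SYNCH}, syn

def post(delta, locs, word):
    """Set of locations reachable from locs by reading word."""
    for a in word:
        locs = set().union(*(delta[(loc, a)] for loc in locs))
    return locs

# Hypothetical example: words starting with "a" are accepted (in q1),
# words starting with "b" are rejected (in q2).
locations = {"q0", "q1", "q2"}
delta = {
    ("q0", "a"): {"q1"}, ("q0", "b"): {"q2"},
    ("q1", "a"): {"q1"}, ("q1", "b"): {"q1"},
    ("q2", "a"): {"q2"}, ("q2", "b"): {"q2"},
}
syn_locs, syn = build_syn_nfa(locations, {"a", "b"}, delta, "q0", {"q1"})
```

Here the rejected word `b` yields the synchronizing word `*b#`, matching $(\star, d) \cdot w \cdot(\#, d)$, while the accepted word `a` leaves a token in reset after `*a#`.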

Second, assume that $\mathcal{R}_{\text{syn}}$ has some synchronizing data word. All transitions in synch are self-loops with update on all registers; thus, $\mathcal{R}_{\text{syn}}$ can only synchronize in synch. Moreover, synch is only accessible with \#-transitions; assuming $w$ is one of the shortest synchronizing data words, we see that $\operatorname{post}(L \times D, w)=\{(\text{synch}, \nu)\}$ for some $\nu \in D^{|R|}$. From all locations $\ell \in L$ we have $\ell \xrightarrow{\star R \downarrow} \ell_{\text{in}}$; we say that $\star$-transitions reset $\mathcal{R}_{\text{syn}}$. Moreover, the only transition leaving location reset is the $\star$-transition. Thus, a reset followed by some $\#$ must occur while synchronizing. Let $w=w_{0}\left(\star, d_{\star}\right) w_{1}\left(\#, d_{\#}\right) w_{2}$, where $w_{1} \in(\Sigma \times D)^{+}$ is the data word between the last occurrence of $\star$ and the first following occurrence of $\#$, and $w_{2} \in\left(\left(\Sigma_{\text{syn}} \backslash\{\star\}\right) \times D\right)^{*}$. We prove that $w_{1} \notin L(\mathcal{R})$. By contradiction, assume that $w_{1}$ is in the language; thus, there exist valuations $\nu_{i}, \nu_{f} \in D^{|R|}$ such that $\mathcal{R}_{\text{syn}}$ has a run over $w_{1}$ starting in $\left(\ell_{\text{in}}, \nu_{i}\right)$ and ending in $\left(\ell_{f}, \nu_{f}\right)$, where $\ell_{f} \in L_{\mathrm{f}}$. In fact, since all outgoing transitions in $\ell_{\text{in}}$ update all registers, $\mathcal{R}_{\text{syn}}$ has an accepting run over $w_{1}$ for all valuations $\nu_{i}$.

Note that $w_{0}$ cannot be a synchronizing word for $\mathcal{R}_{\text{syn}}$, because this would contradict the assumption that $w$ is one of the shortest synchronizing data words. This implies that there must be some configuration $q$ such that $\operatorname{post}_{\mathcal{R}_{\text{syn}}}\left(q, w_{0}\right)$ contains some configuration $(\ell, \nu)$ with $\ell \neq \text{synch}$. From $(\ell, \nu)$, inputting the next $\left(\star, d_{\star}\right)$ (that is, the one after $w_{0}$ in the synchronizing word $w$), we reach $\left(\ell_{\text{in}},\left\{d_{\star}\right\}^{|R|}\right)$. Since for all valuations $\nu_{i}$, starting in $\left(\ell_{\text{in}}, \nu_{i}\right)$, $\mathcal{R}_{\text{syn}}$ has an accepting run over $w_{1}$, it must also have a run from $\left(\ell_{\text{in}},\left\{d_{\star}\right\}^{|R|}\right)$ to some accepting configuration $\left(\ell_{f}, \nu_{f}\right)$. Reading the following \# (that is, the one after $w_{1}$ in the synchronizing word $w$), reset is reached. Since $w_{2}$ does not contain any $\star$, reset is never left, meaning that $\mathcal{R}_{\text{syn}}$ cannot synchronize in synch, a contradiction. The proof is complete.

Note that the reduction preserves the number of registers in the NRAs.

