Lower and upper bounds for single-scanner snapshot implementations

Distributed Computing

Abstract

We present a collection of upper and lower bounds on the complexity of asynchronous, wait-free, linearizable, single-scanner snapshot implementations from read–write registers. We argue that at least m registers are needed to implement a single-scanner snapshot with m components and we prove that, in space-optimal implementations, SCANS execute \(\varOmega (m^2)\) steps. We present an algorithm that runs in \(O(m^2)\) steps and uses \(m+1\) registers. We also present three implementations (namely, T-Opt, RT and RT-Opt) that beat the \(\varOmega (m^2)\) lower bound by using more registers. Specifically, T-Opt has step complexity O(1) for UPDATE and O(m) for SCAN. This step complexity is optimal, but the number of registers that T-Opt uses is unbounded. We then present interesting recycling techniques to bound the number and the size of registers used, resulting in RT and RT-Opt. Specifically, RT-Opt, which has optimal step complexity, uses O(mn) bounded-size registers, where n is the total number of processes. Our implementations are the first with step complexities that are (linear or quadratic) functions only of m (and not of n). Moreover, T-Opt and RT-Opt are the first implementations with optimal step complexity.

References

  1. Afek, Y., Attiya, H., Dolev, D., Gafni, E., Merritt, M., Shavit, N.: Atomic snapshots of shared memory. J. ACM 40(4), 873–890 (1993)

  2. Anderson, J.H.: Composite registers. Distrib. Comput. 6(3), 141–154 (1993)

  3. Anderson, J.H.: Multi-writer composite registers. Distrib. Comput. 7(4), 175–195 (1994)

  4. Aspnes, J.: Time-and space-efficient randomized consensus. In: Proceedings of the Ninth Annual ACM Symposium on Principles of Distributed Computing, pp. 325–331. ACM (1990)

  5. Aspnes, J., Herlihy, M.: Wait-free data structures in the asynchronous PRAM model. In: Proceedings of 2nd ACM Symposium on Parallel Algorithms and Architectures, pp. 340–349 (1990)

  6. Attiya, H., Ellen, F., Fatourou, P.: The complexity of updating snapshot objects. J. Parallel Distrib. Comput. 71(12), 1570–1577 (2011)

  7. Attiya, H., Lynch, N., Shavit, N.: Are wait-free algorithms fast? J. ACM 41(4), 725–763 (1994)

  8. Attiya, H., Rachman, O.: Atomic snapshots in \(O(n \log n)\) operations. SIAM J. Comput. 27(2), 319–340 (1998)

  9. Burns, J., Lynch, N.: Bounds on shared memory for mutual exclusion. Inf. Comput. 107(2), 171–184 (1993)

  10. Dwork, C., Waarts, O.: Simple and efficient bounded concurrent timestamping and the traceable use abstraction. J. ACM 46(5), 633–666 (1999)

  11. Ellen, F., Fatourou, P., Ruppert, E.: Time lower bounds for implementations of multi-writer snapshots. J. ACM 54(6), 30 (2007)

  12. Fatourou, P., Fich, F., Ruppert, E.: Space-optimal multi-writer snapshot objects are slow. In: Proceedings of the Twenty-First Annual Symposium on Principles of Distributed Computing, pp. 13–20. ACM (2002)

  13. Fatourou, P., Fich, F., Ruppert, E.: A tight time lower bound for space-optimal implementations of multi-writer snapshots. In: Proceedings of the 35th ACM Symposium on Theory of Computing, pp. 259–268 (2003)

  14. Fatourou, P., Fich, F., Ruppert, E.: Time-space tradeoffs for implementations of snapshots. In: Proceedings of the 38th ACM Symposium on Theory of Computing (2006)

  15. Fich, F., Herlihy, M., Shavit, N.: On the space complexity of randomized synchronization. J. ACM 45(5), 843–862 (1998)

  16. Gafni, E., Merritt, M., Taubenfeld, G.: The concurrency hierarchy, and algorithms for unbounded concurrency. In: Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing, pp. 161–169. ACM (2001)

  17. Gawlick, R., Lynch, N., Shavit, N.: Concurrent timestamping made simple. In: Proceedings of the Israel Symposium on the Theory of Computing and Systems, LNCS, vol. 601, pp. 171–183 (1992)

  18. Herlihy, M., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

  19. Inoue, M., Chen, W., Masuzawa, T., Tokura, N.: Linear time snapshots using multi-writer multi-reader registers. In: 8th International Workshop on Distributed Algorithms, LNCS, vol. 857, pp. 130–140 (1994)

  20. Israeli, A., Shaham, A., Shirazi, A.: Linear-time snapshot implementations in unbalanced systems. Math. Syst. Theory 28(5), 469–486 (1995)

  21. Jayanti, P.: F-arrays: implementation and applications. In: Proceedings of the 21st ACM Symposium on Principles of Distributed Computing, pp. 270–279 (2002)

  22. Jayanti, P.: An optimal multi-writer snapshot algorithm. In: Proceedings of the 37th ACM Symposium on Theory of Computing, pp. 723–732 (2005)

  23. Jayanti, P., Tan, K., Toueg, S.: Time and space lower bounds for nonblocking implementations. SIAM J. Comput. 30(2), 438–456 (2000)

  24. Kirousis, L.M., Spirakis, P., Tsigas, P.: Reading many variables in one atomic operation: solutions with linear or sublinear complexity. IEEE Trans. Parallel Distrib. Syst. 5(7), 688–696 (1994)

  25. Mullender, S.: Distributed Systems. Addison-Wesley, Boston (1994)

  26. Peterson, G.L.: Concurrent reading while writing. ACM Trans. Program. Lang. Syst. (TOPLAS) 5(1), 46–55 (1983)

  27. Rianny, Y., Shavit, N., Touitou, D.: Towards a practical snapshot algorithm. Theor. Comput. Sci. 269(1–2), 163–201 (2003)

Acknowledgements

This research has been supported by the European Commission through the TransForm, Euroserver, and HiPEAC3 projects. It has also been supported by the ARISTEIA Action of the Operational Programme Education and Lifelong Learning through the GreenVM project, and by the project “Computer Science Studies at the University of Ioannina” of the Operational Program for Education and Initial Vocational Training funded by the 3rd Community Support Framework and the Hellenic Ministry of Education. We would like to thank Faith Ellen for her valuable comments on an early version of this work. We would also like to thank Prasad Jayanti for pointing out some related work.

Author information

Corresponding author

Correspondence to Nikolaos D. Kallimanis.

Additional information

Preliminary versions of this work appeared in the Proceedings of the 25th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), 2006, pp. 228–237 and in the Proceedings of the 26th Annual ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), 2007, pp. 33–42.

Appendix: Proofs of Lemmas 1, 2, and 3

Consider any implementation of an m-component multi-writer snapshot object shared by \(n > m+1\) processes from a set of m multi-writer read/write registers. The statements of Lemmas 35 and 36 and their proofs are slightly modified versions of similar lemmas that appear in [12]. Lemmas 37, 38, 39 and 40 and their proofs are exactly the same as their analogs in [12]. For the sake of simplicity, our proofs below assume that there is a unique process \(p_s\) that performs SCAN operations in the system. (We remark that the lemmas hold even if SCAN operations are executed by different processes, provided that no two SCANS overlap.)

For the sake of simplicity, in this section we assume that an execution is a sequence of steps. Fix any execution \(\alpha \) of a single-scanner, multi-writer, m-component snapshot implementation from m registers starting from \(C_0\).

Lemma 35

Suppose that, in configuration C, a set \(P_O\) of at most \(n-2\) processes covers a set of registers O, and all processes not in \(P_O\) are inactive. Furthermore, suppose there is some component \(A_i\) such that no process has a pending UPDATE to \(A_i\) in configuration C. Consider an execution starting from C in which the processes in \(P_O\) execute a step each to perform their writes and, immediately afterwards, the scanner \(p_s\) performs a solo execution in which it finishes its pending operation (if it has one) and then performs a complete SCAN. Let \(v\) be the value that this SCAN returns for component \(A_i\). Then, for all \(p \not \in P_O \cup \{ p_s \}\) and all \(v' \ne v\), the solo execution by p of \(\mathtt{UPDATE}(i,v')\) starting from C must perform a write to a register outside O.

Proof

Suppose not. Let \(C'\) be the configuration obtained from C when the processes in \(P_O\) perform one step each and let \(\beta \) be the solo execution by \(p_s\) starting from \(C'\). Let \(C''\) be the configuration obtained when p performs a solo execution of \(\mathtt{UPDATE}(i, v')\) starting from C and then the processes in \(P_O\) execute a step each to perform their writes. By our assumption, p does not write to any register outside O, so each register has the same value in \(C''\) that it has in \(C'\). Furthermore, \(p_s\) is in the same state in \(C'\) and \(C''\). Therefore, the solo execution \(\beta \) by \(p_s\) starting from \(C''\) is legal and \(p_s\)’s SCAN returns the value \(v\) for component \(A_i\). However, by linearizability, the SCAN in \(\beta \) starting from \(C''\) must return the value \(v'\ne v\) for component \(A_i\), since p completed its \(\mathtt{UPDATE}(i,v')\) before the SCAN began and no process has a pending UPDATE to \(A_i\) at C. This is a contradiction. \(\square \)
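
The indistinguishability step in this proof can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes that registers are modelled as a Python dictionary, and the helper names `block_write` and `solo_writes` and the register names `R1` to `R3` are hypothetical. It shows that writes confined to O are obliterated once the processes in \(P_O\) perform their covering writes, so the register contents in \(C'\) and \(C''\) coincide, whereas a write outside O remains visible.

```python
def block_write(registers, pending_writes):
    """Apply one pending write per covering process (register -> value)."""
    result = dict(registers)
    result.update(pending_writes)
    return result


def solo_writes(registers, writes):
    """Apply a sequence of (register, value) writes by a single process."""
    result = dict(registers)
    for reg, val in writes:
        result[reg] = val
    return result


# Configuration C: three registers; the processes in P_O cover O = {R1, R2}.
C = {"R1": 0, "R2": 0, "R3": 0}
covered = {"R1": "a", "R2": "b"}  # pending writes of the processes in P_O

# C': the processes in P_O perform their writes directly from C.
C_prime = block_write(C, covered)

# C'': p first runs an UPDATE that writes only inside O, then P_O writes.
update_inside_O = [("R1", "x"), ("R2", "y"), ("R1", "z")]
C_double_prime = block_write(solo_writes(C, update_inside_O), covered)

# Same pattern, but the UPDATE also writes to R3, which lies outside O.
update_outside_O = [("R1", "x"), ("R3", "w")]
C_visible = block_write(solo_writes(C, update_outside_O), covered)

print(C_prime == C_double_prime)  # True
print(C_prime == C_visible)       # False
```

The model covers only register contents; the proof additionally uses the fact that \(p_s\) takes no steps and therefore has the same local state in both configurations.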

For any configuration C and for any set of processes \(P^\prime \), the set of components with a pending UPDATE in C by a process in \(P'\) is denoted \(CPU(C, P')\).

Definition 1

Consider any integer \(\ell \), where \(1 \le \ell \le m < n\). A configuration C is \(\ell \)-fatal if there exists a subset O of \(\ell \) registers and a set \(P_O\) of \(\ell \) processes such that \(P_O\) covers O in C and \(|CPU(C, P_O)| < \ell \).
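
Definition 1 can be read as a predicate over a configuration. The following is a minimal sketch under stated assumptions, not code from the paper: a configuration is reduced to two hypothetical dictionaries, `covers` (the register each process is poised to write) and `pending_update` (the component of each process's pending UPDATE, if any), and "\(P_O\) covers O" with \(|P_O| = |O| = \ell\) is interpreted as each process in \(P_O\) covering a distinct register of O.

```python
from itertools import combinations


def cpu(pending_update, processes):
    """CPU(C, P'): components with a pending UPDATE by a process in P'."""
    return {pending_update[p] for p in processes if p in pending_update}


def is_fatal(covers, pending_update, ell):
    """Return True if the modelled configuration is ell-fatal.

    covers: process -> register that the process is poised to write
    pending_update: process -> component of its pending UPDATE (if any)
    """
    for group in combinations(sorted(covers), ell):
        registers = {covers[p] for p in group}
        # P_O covers a set O of ell registers: one distinct register each.
        if len(registers) == ell and len(cpu(pending_update, group)) < ell:
            return True
    return False


# Two updaters of the same component poised on different registers.
same_component = is_fatal({"p1": "R1", "p2": "R2"},
                          {"p1": "A1", "p2": "A1"}, 2)

# Two updaters of different components poised on different registers.
different_components = is_fatal({"p1": "R1", "p2": "R2"},
                                {"p1": "A1", "p2": "A2"}, 2)

# A process with no pending UPDATE (e.g. a scanner) poised to write.
writing_scanner = is_fatal({"ps": "R1"}, {}, 1)

print(same_component, different_components, writing_scanner)
# True False True
```

The first and third cases correspond to the configurations ruled out in the proofs of Lemmas 38 and 37, respectively.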

Lemma 36

No implementation for n processes of an m-component snapshot object from m registers has a reachable \(\ell \)-fatal configuration, for \(1 \le \ell \le m < n-1\).

Proof

Suppose the lemma is false. Let \(\ell \) be the largest integer such that there is a reachable \(\ell \)-fatal configuration, \(C_1\). Then there is a set O of \(\ell \) registers and a set \(P_O\) of \(\ell \) processes such that \(P_O\) covers O and \(| CPU(C_1,P_O)| < \ell \). Let C be the configuration obtained from \(C_1\) by running all processes not in \(P_O\) until they are inactive. Since it holds that \(| CPU(C,P_O)| = | CPU(C_1,P_O)| < \ell \le m\), there exists a component \(A_i \not \in CPU(C,P_O)\).

Let \(p \not \in P_O\) be any process other than \(p_s\). This process exists because \(|P_O| = \ell \) and \(1 \le \ell < n-1\). Consider the execution starting from C in which the processes in \(P_O\) execute a step each to perform their writes, \(p_s\) finishes its pending operation (if any), and then \(p_s\) performs a complete SCAN. Let \(v\) be the value that this SCAN returns for component \(A_i\). By Lemma 35, for all \(v' \ne v\), the solo execution of \(\mathtt{UPDATE}(i, v')\) to \(A_i\) by p starting from C contains a write to a register \(R \not \in O\).

If \(\ell =m\), then we have a contradiction, since all registers are in O. Otherwise, \(\ell < m\). In this case, let \(C_2\) be the reachable configuration obtained by performing p’s solo execution of \(\mathtt{UPDATE}(i,v')\) starting from C until just before p writes to R for the first time. Let \(O' = O \cup \{R\}\) and let \(P_O' = P_O \cup \{p\}\). Then \(| O' | = | P_O' | = \ell +1\), \(P_O'\) covers \(O'\) in \(C_2\), and \(CPU(C_2, P_O') = CPU(C, P_O) \cup \{A_i\}\), so \(| CPU(C_2, P_O') | < \ell +1\). Thus, \(C_2\) is a reachable \((\ell +1)\)-fatal configuration, contradicting the maximality of \(\ell \). \(\square \)
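
The counting step behind the final inequality of this proof can be unpacked as follows, using only facts stated in the proof: \(A_i \not \in CPU(C, P_O)\) and \(| CPU(C, P_O) | < \ell \), that is, \(| CPU(C, P_O) | \le \ell - 1\).

\[ | CPU(C_2, P_O') | = | CPU(C, P_O) \cup \{A_i\} | = | CPU(C, P_O) | + 1 \le (\ell - 1) + 1 = \ell < \ell + 1. \]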

Lemma 37

SCAN operations never perform writes.

Proof

Suppose there is an execution of a SCAN operation by process \(p_s\) that contains a write to a register R. Consider the configuration C that occurs just before this write is performed. Since \(\{p_s\}\) covers \(\{R\}\) and \(CPU(C,\{p_s\})\) is empty, this configuration is 1-fatal, contradicting Lemma 36. \(\square \)

A solo SCAN starting from \(C_0\) returns \(\bot \) for every component. For each process \(p_i\) other than \(p_s\), each component \(A_j\), and each possible value \(v\ne \bot \), consider the solo execution of an UPDATE of component \(A_j\) with value \(v\) by process \(p_i\) starting from \(C_0\). Since all processes are inactive in \(C_0\), we can apply Lemma 35 with \(O = P_O = \emptyset \) to see that this execution by \(p_i\) contains at least one write to a register. Denote by \(R_i(j,v)\) the first register written by \(p_i\) and denote by \(\rho _i(j,v)\) the prefix of this execution up to, but not including, this first write. (The sequence \(\rho _i(j,v)\) may be empty.)

Lemma 38

Consider any component \(A_j\). For any processes \(p_{i_1}\) and \(p_{i_2}\) other than \(p_s\), and for any non-\(\bot \) values \(v_{1}\) and \(v_{2}\), \(R_{i_1}(j,v_{1}) = R_{i_2}(j,v_{2})\).

Proof

Assume first that \(p_{i_1} \ne p_{i_2}\). Consider the execution \(\rho _{i_1}(j,v_1) \cdot \rho _{i_2}(j,v_2)\) starting from \(C_0\) and let C be the resulting configuration. This execution is legal since \(p_{i_1}\) performs no writes during \(\rho _{i_1}(j,v_1)\). Note that \(\{ p_{i_1}, p_{i_2} \}\) covers \(\{ R_{i_1}(j,v_1), R_{i_2}(j,v_2) \}\) in C and \(CPU(C, \{ p_{i_1}, p_{i_2} \}) = \{A_j\}\). If \(R_{i_1}(j,v_1) \ne R_{i_2}(j,v_2)\), then C is 2-fatal. This contradicts Lemma 36. Hence \(R_{i_1}(j,v_1) = R_{i_2}(j,v_2)\).

Assume now that \(p_{i_1} = p_{i_2}\). Let \(p_i\) be any process other than \(p_{i_1}\) and \(p_s\); such a process exists because \(n > m+1\). By the argument above, \(R_{i}(j,v_1) = R_{i_1}(j,v_1)\) and \(R_{i}(j,v_1) =R_{i_2}(j,v_2)\). Hence \(R_{i_1}(j,v_1) =R_{i_2}(j,v_2)\). \(\square \)

Lemma 38 allows us to define \(R_j\) to be the register \(R_i(j,v)\) covered by each process \(p_i\) other than \(p_s\), immediately after it executes \(\rho _i(j,v)\) starting from \(C_0\), for any value \(v\ne \bot \). That is, every process (other than \(p_s\)) does its first write to \(R_j\) when it performs any solo UPDATE to \(A_j\) (with a non-\(\bot \) value) starting from \(C_0\).

Lemma 39

Let \(\alpha \) be an execution starting from \(C_0\) in which some process other than \(p_s\) takes no steps. Then, for each \(j \in \{ 1,\ldots , m\}\), UPDATE operations to component \(A_j\) in \(\alpha \) write only to \(R_j\).

Proof

Suppose there is a process \(p_i\) other than \(p_s\) that performs a write to a register \(R \ne R_j\) during the execution of an UPDATE to component \(A_j\) in \(\alpha \). Let \(\alpha '\) denote the prefix of \(\alpha \) up to, but not including, this write by \(p_i\) to register R.

Let \(p_k\) be a process other than \(p_s\) that takes no steps in \(\alpha \) and let \(v\) be a non-\(\bot \) value. Consider the execution \(\rho _k(j, v) \cdot \alpha '\) and let \(C'\) be the resulting configuration. This execution is legal since \(p_k\) performs no writes during \(\rho _k(j,v)\). Note that \(\{ p_i, p_k \}\) covers \(\{ R, R_j \}\) in \(C'\) and since it holds that \(CPU(C', \{ p_i, p_k \}) = \{A_j\}\), it follows that \(C'\) is 2-fatal. This contradicts Lemma 36. \(\square \)

The next result shows that processes that perform UPDATE operations on different snapshot components must write to different registers.

Lemma 40

\(R_{j_1} \ne R_{j_2}\) for distinct \(j_1, j_2 \in \{ 1, \ldots , m \}\).

Proof

To derive a contradiction, suppose \(R_{j_1} = R_{j_2}\) for some \(j_1 \ne j_2\). Let \(p_{k_1}\) and \(p_{k_2}\) be two distinct processes other than \(p_s\). Let \(v\) be some non-\(\bot \) value. Let C be the configuration that results when \(\rho _{k_2}(j_2,v)\) is performed by \(p_{k_2}\) starting from \(C_0\). In configuration C, \(\{ p_{k_2}\}\) covers \(\{R_{j_2} \}\), all other processes are inactive, and no process has a pending UPDATE to \(A_{j_1}\). Let \(C'\) be the configuration obtained from C by allowing \(p_{k_2}\) to do its pending write. A solo SCAN by process \(p_s\) starting from \(C'\) returns \(\bot \) for component \(A_{j_1}\), since no UPDATES to \(A_{j_1}\) have been started in this execution. Let \(\alpha \) be the solo execution of \(\mathtt{UPDATE}(j_1,v)\) by \(p_{k_1}\) starting from C. By Lemma 35, \(p_{k_1}\) must write to some register other than \(R_{j_1}= R_{j_2}\) during \(\alpha \).

Since \(p_{k_2}\) performs no writes during \(\rho _{k_2}(j_2,v)\), it is also the case that \(\alpha \) is a legal execution starting from \(C_0\). Process \(p_{k_2}\) takes no steps during \(\alpha \), so Lemma 39 implies that \(p_{k_1}\) writes only to \(R_{j_1}\) during \(\alpha \). This is a contradiction. \(\square \)
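
Taken together, Lemmas 37, 39 and 40 tie each component \(A_j\) to its own register \(R_j\) in the executions they cover. The sketch below is a sequential toy model of that layout only, written under assumptions that are not in the paper: the class name `LayoutModel` and its `write_log` field are hypothetical, components are indexed from 0, and `None` stands in for \(\bot \). It is not a linearizable concurrent implementation.

```python
class LayoutModel:
    """Sequential toy model: component A_j is tied to register R_j."""

    def __init__(self, m):
        self.registers = [None] * m  # None stands in for the initial value
        self.write_log = []          # indices of the registers written so far

    def update(self, j, value):
        # Lemma 39: an UPDATE to A_j writes only to R_j.
        self.registers[j] = value
        self.write_log.append(j)

    def scan(self):
        # Lemma 37: a SCAN performs no writes. Reading every register once
        # is NOT linearizable under concurrency; this model is sequential.
        return list(self.registers)


snap = LayoutModel(3)
snap.update(0, "a")
snap.update(2, "c")
view = snap.scan()
print(view)            # ['a', None, 'c']
print(snap.write_log)  # [0, 2]
```

Because distinct components map to distinct registers (Lemma 40), the model needs exactly m registers for m components.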

Cite this article

Fatourou, P., Kallimanis, N.D. Lower and upper bounds for single-scanner snapshot implementations. Distrib. Comput. 30, 231–260 (2017). https://doi.org/10.1007/s00446-016-0286-7

  • DOI: https://doi.org/10.1007/s00446-016-0286-7
