Specification and space complexity of collaborative text editing
Introduction
Collaborative text editing systems, like Google Docs [10], [9], Apache Wave [1], or wikis [17], allow users at multiple sites to concurrently edit the same document. To achieve high responsiveness and availability, such systems often replicate the document in geographically distributed sites or on user devices. A user can modify the document at a nearby replica, which propagates the modifications to other replicas asynchronously. This propagation can be done either via a centralized server or peer-to-peer. An essential feature of a collaborative editing system is that all changes eventually propagate to all replicas and get incorporated into the document in a consistent way. In particular, such systems aim to guarantee eventual consistency: if users stop modifying the document, then the replicas will eventually converge to the same state [36], [35].
Fig. 1(a) gives an example scenario of a document edited at several replicas. First, one replica inserts x at the first position (zero-indexed) into the empty list. This insertion then propagates to a second replica, which inserts a to the left of x, and to a third replica, which inserts b to the right of x. Later the modifications made by the second and third replicas propagate to all other replicas, including one that then reads the list and observes axb. In this scenario, the desired system behavior is straightforward, but sometimes this is not the case. To illustrate this point, consider the scenario in Fig. 1(b) (also known as the TP2 puzzle [21]), where a replica deletes x from the list before the insertions of a and b propagate to it. One might expect a subsequent read by that replica to return ab, given the orderings ax and xb established at other replicas. However, some implementations allow ba as a response [27]; e.g., we demonstrate in Appendix A that this is the case in the Jupiter protocol [20] used in public collaboration systems [37].
Users would like to have highly available protocols, which respond to their operations immediately, without performing any communication. There have been a number of proposals of highly available collaborative editing protocols, using techniques such as operation transformations (OT) [12], [25], [29], [30] and conflict-free replicated data types (also known as CRDTs) [23], [39], [26]. It is challenging to specify the desired behavior of collaborative editing protocols without referring to the actual implementation, in particular when identifying which visible operations should be reflected in the responses of operations. Yet abstracting away implementation details is essential for studying inherent properties and limitations. Existing specifications [30], [18] instead refer to implementation details, e.g., messages sent and received; this does not provide a common ground on which to compare different implementation approaches for the same specification [32]. In addition, several of the protocols have been shown not to satisfy even the basic expectation of eventual consistency [15].
We introduce an implementation-independent specification of a replicated list object. The object allows its clients to insert elements into and delete elements from the list at different replicas, and thereby captures the core aspects of collaborative text editing [12] (Section 3). Our specification has two flavors. The strong specification ensures that orderings of list elements observed by different clients are consistent. This includes transitive consequences of orderings over subsequently deleted elements, such as x in Fig. 1(b), and thus the strong specification disallows the response ba for the read in this figure. The weak specification does not take transitivity into account, thus allowing the read in Fig. 1(b) to return either ba or ab. We show that both of these specifications ensure eventual consistency.
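The difference between the two flavors on Fig. 1(b) can be illustrated with a small check over observed reads. This is only a simplified sketch, not the paper's formal definitions: the function names `pairwise_consistent` and `globally_consistent` are hypothetical, each read is given as a string of distinct one-character elements, and the "strong-style" check simply asks whether a single total order explains every read.

```python
from itertools import combinations

def observed_pairs(reads):
    """All ordered pairs (u, v) such that u appears before v in some read."""
    pairs = set()
    for r in reads:
        for i, j in combinations(range(len(r)), 2):
            pairs.add((r[i], r[j]))
    return pairs

def pairwise_consistent(reads):
    """Weak-style check (assumption): no two elements are seen in both orders."""
    pairs = observed_pairs(reads)
    return all((v, u) not in pairs for (u, v) in pairs)

def globally_consistent(reads):
    """Strong-style check (assumption): the observed orders contain no cycle,
    so transitive consequences through deleted elements are respected."""
    succ = {}
    for u, v in observed_pairs(reads):
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in succ}

    def has_cycle(v):
        # Depth-first search; reaching a GREY node means a back edge.
        color[v] = GREY
        for w in succ[v]:
            if color[w] == GREY or (color[w] == WHITE and has_cycle(w)):
                return True
        color[v] = BLACK
        return False

    return not any(color[v] == WHITE and has_cycle(v) for v in succ)
```

With the reads ax, xb and ba from Fig. 1(b), `pairwise_consistent` holds because no pair is directly observed in both orders, while `globally_consistent` fails because a before x, x before b and b before a form a cycle.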
We prove that the strong specification is correctly implemented by a variant of the RGA (Replicated Growable Array) protocol [26], which is in the style of replicated data types [23] (Section 4). The protocol represents the list as a tree, with read operations traversing the tree in a deterministic order. Inserting an element a right after an element x (as in Fig. 1(b)) adds a as a child of x in the tree. Deleting an element x just marks it as such; the node of x is left in the tree, creating a so-called tombstone. Keeping the tombstone enables the protocol to correctly incorporate insertions of elements received from other replicas that are ordered right after x (e.g., that of b in Fig. 1(b)).
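The tree-with-tombstones idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the variant analyzed in the paper: the class and method names are hypothetical, timestamps are assumed to be `(counter, replica_id)` pairs, siblings are assumed to be visited in decreasing timestamp order, and operations are assumed to be delivered exactly once and in causal order. The replica names in `fig_1b` are also placeholders.

```python
class Node:
    def __init__(self, ts, value):
        self.ts = ts              # (counter, replica_id): unique, totally ordered
        self.value = value
        self.deleted = False      # True marks a tombstone
        self.children = []        # kept sorted by decreasing timestamp

class Replica:
    def __init__(self, rid):
        self.rid = rid
        self.clock = 0
        self.root = Node((0, ""), None)        # sentinel head of the list
        self.index = {self.root.ts: self.root}

    def _visible(self):
        out = []
        def walk(node):
            # Pre-order traversal; tombstones are skipped but still traversed.
            for child in node.children:
                if not child.deleted:
                    out.append(child)
                walk(child)
        walk(self.root)
        return out

    def read(self):
        return [n.value for n in self._visible()]

    def insert(self, value, k):
        # Position k means "right after the (k-1)-th visible element".
        parent = self.root if k == 0 else self._visible()[k - 1]
        self.clock += 1
        op = ("ins", (self.clock, self.rid), value, parent.ts)
        self.apply(op)
        return op                 # to be sent to the other replicas

    def delete(self, k):
        op = ("del", self._visible()[k].ts)
        self.apply(op)
        return op

    def apply(self, op):
        if op[0] == "ins":
            _, ts, value, parent_ts = op
            self.clock = max(self.clock, ts[0])
            node = Node(ts, value)
            self.index[ts] = node
            siblings = self.index[parent_ts].children
            siblings.append(node)
            siblings.sort(key=lambda n: n.ts, reverse=True)
        else:
            self.index[op[1]].deleted = True   # keep the node as a tombstone

def fig_1b():
    r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
    op_x = r2.insert("x", 0)
    r1.apply(op_x)
    r3.apply(op_x)
    op_a = r1.insert("a", 0)      # a to the left of x
    op_b = r3.insert("b", 1)      # b right after x
    r2.delete(0)                  # x is deleted before a and b arrive
    r2.apply(op_a)
    r2.apply(op_b)
    return r2.read()
```

In this sketch `fig_1b()` yields the list a, b: the insertion of b still finds its parent x in the tree because x was only marked as deleted, which is exactly the role of the tombstone.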
The simplicity of handling deletions via tombstones in the RGA protocol comes with a high space overhead. More precisely, the metadata overhead [6] of a list implementation is the ratio between the size of a replica's state (in bits) and the size of the user-observable content of the state, i.e., the list that will be read in this state. As we show, the metadata overhead of the RGA protocol is $O(D \lg k)$, where D is the number of deletions issued by clients and k is the total number of operations (Section 4). The number of deletions can be high. For example, a 2009 study [39] indicates that the “George W. Bush” Wikipedia page has about 500 lines. However, since modifications are usually handled as deleting the original line and then inserting the revised line, the page had accumulated about 1.6 million deletions.
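As a rough worked example of the ratio just defined, the following uses the Wikipedia figures quoted above. The symbols $\sigma$ (a replica state), $|\sigma|$ (its size in bits) and $\mathrm{list}(\sigma)$ (the list read in that state) are notation assumed here for illustration, and the count assumes one retained tree node per deleted line.

```latex
\[
  \mathrm{overhead}(\sigma) \;=\; \frac{|\sigma|}{|\mathrm{list}(\sigma)|},
  \qquad
  \frac{1.6 \times 10^{6}\ \text{tombstones}}{500\ \text{visible lines}}
  \;=\; 3200\ \text{tombstones per visible line}.
\]
```

Under these assumptions, a tombstone-based replica of that page would hold thousands of deleted nodes for every line a user can actually see.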
Other CRDT protocols do not keep tombstones, e.g., Treedoc [23], Logoot [39] or WOOT [22], but the replica state contains metadata, e.g., labels, that grows linearly in the number of deletions. OT protocols [12] pay metadata overhead by logging unacknowledged updates in each replica.
Our main result is that this overhead is indeed, in some sense, inherent. We prove that any push-based protocol that implements the weak list specification for $n \geq 3$ replicas incurs a metadata overhead of $\Omega(D)$, where D is the number of deletions. In a push-based protocol, each replica propagates list updates to its peers as soon as possible, and merges remote updates into its state as soon as they arrive (we give a precise definition in Section 5). This assumption captures the operation of all highly available protocols that we are aware of, and it includes both CRDT and OT protocols. The lower bound holds even if the network guarantees causal atomic broadcast [11].
We first establish our lower bound for the peer-to-peer model. Client/server protocols attempt to save space on replicas by keeping it on a central server. Replicas communicate only with the server and not directly with each other, and so the server does more than merely relay messages between replicas. Using the fact that the lower bound holds for a network with causal atomic broadcast, we extend it to show that, in a push-based client/server list protocol, the metadata overhead at the clients is still $\Omega(D)$. This shows that relying on a central server does not reduce the metadata overhead at the clients.
We prove our lower bound using an inductive information-theoretic argument, which we consider to be the novel technical contribution of this paper. For every $d$-bit string $w$, we construct a particular execution $\alpha_w$ of the protocol such that, at its end, the user-observable state of some replica, whose full state we denote $\sigma_w$, is a list whose size in bits is fixed by the construction. We then show that, given $\sigma_w$, we can decode $w$ by exercising the protocol in a black-box manner. This implies that all states $\sigma_w$ must be distinct and, since there are $2^d$ of them, one of these states must take at least $d$ bits. The procedure that decodes $w$ from $\sigma_w$ is nontrivial and represents the key insight of our proof. It recovers $w$ one bit at a time using a “feedback loop” between two processes: one performs a black-box experiment on the protocol to recover the next bit of $w$, and the other reconstructs the corresponding steps of the execution $\alpha_w$; the messages sent in the reconstructed part of $\alpha_w$ then form the basis for the experiment to decode the next bit of $w$.
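The counting step at the end of this argument can be spelled out as a short pigeonhole calculation. The notation $\sigma_w$ for the replica state associated with the string $w$ is assumed here for illustration.

```latex
\[
  \#\{\text{bit strings of length} < d\}
  \;=\; \sum_{i=0}^{d-1} 2^{i}
  \;=\; 2^{d} - 1
  \;<\; 2^{d}
  \;=\; \#\{\sigma_w : w \in \{0,1\}^{d}\}.
\]
```

Since the $2^d$ states are pairwise distinct, they cannot all be represented by strings shorter than $d$ bits, so at least one of them occupies at least $d$ bits.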
The class of push-based protocols, to which our metadata overhead lower bound proof applies, is specified with low-level properties that partially restrict implementation details of the protocol. We show, however, that under a weaker network model (defined in Section 7), if a protocol has invisible reads (which do not change the state of the replica), then being push-based follows from guaranteeing the higher-level property of eventual consistency. Hence, our lower bound also applies to protocols satisfying the latter property.
System model
We are concerned with highly available implementations of a replicated object [5], [6], which supports a set of operations. Such an implementation consists of replicas that receive and respond to user operations on the object and use message passing to communicate changes to the object's state. The high availability property sets this model apart from standard message-passing models: we require that replicas respond to user operations immediately, without performing any communication, so that
Collaborative text editing
Following Ellis and Gibbs [12], we model the collaborative text editing problem (henceforth, simply collaborative editing) as the problem of implementing a highly available replicated list object whose elements are from some universe U. Users can insert elements, remove elements and read the list using the following operations, which form the object's set of operations:
- an insert operation with arguments $a \in U$ and $k \in \mathbb{N}$: inserts a at position k in the list (starting from 0) and returns the updated list. For k exceeding the list size, we assume an
An implementation of the strong list specification
We now present an implementation of the list object, which is a reformulation of the RGA (Replicated Growable Array) protocol [26], and prove that it implements the strong list specification.
Push-based protocols
Our lower bound results hold for push-based protocols, a class of protocols that contains the protocols of several collaborative editing systems [20], [23], [26], [29], including the RGA protocol of Section 4. Informally, a replica in a push-based protocol propagates list updates to its peers as soon as possible and merges remote updates into its state as soon as they arrive (as opposed to using a more sophisticated mechanism, such as a consensus protocol).
We define push-based protocols
Lower bounds on metadata overhead
Here we show a lower bound on the worst-case metadata overhead (Definition 11) of push-based protocols satisfying the weak (or strong) list specification.
The proof of this result is nontrivial. It relies on an inductive coding argument, in which a string w is encoded in an execution. Later, the string w is recovered (decoded) bit by bit, using a feedback loop between two replicas: one performs a black-box experiment on the protocol to recover the next bit of w, and the other reconstructs the
Protocols with invisible reads in the presence of disconnections
Our metadata overhead lower bound proof applies to push-based protocols. However, being push-based is a low-level property, which partially specifies implementation details of the protocol. In this section, we show that under a weaker network model (discussed below), for a common class of protocols being push-based follows from guaranteeing the higher-level property of eventual visibility. Hence, our lower bound also applies to protocols satisfying the latter property.
We consider protocols with
Related work
Previous attempts at specifying the behavior of replicated list objects [30], [18] have been informal and imprecise: they typically required the execution of an operation at a remote replica to preserve the effect of the operation at its original replica, but they have not formally defined the notions of the effect and its preservation.
Burckhardt et al. [6] have previously proposed a framework for specifying replicated data types (on which we base our list specifications) and proved lower
Conclusion
This paper provides a precise specification of the replicated list object, which models the core functionality of collaborative text editing systems. We define a strong list specification, and show that it is implemented by the RGA protocol [26], as well as a weak list specification, which is implemented by the Jupiter protocol [20], [38] underlying public collaboration systems [37].
We prove a lower bound of $\Omega(D)$, where D is the number of deletions, on the metadata overhead of push-based list protocols, which model the
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
We would like to thank Marc Shapiro and Pascal Urso for comments that helped improve the paper. Attiya and Morrison were supported by the Israel Science Foundation (grant 1749/14) and by Yad Hanadiv Foundation. Gotsman was supported by an ERC grant RACCOON. Yang was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP, No. R0190-15-2011). Zawirski was supported by an EU project SyncFree.
References (40)
- Evaluating CRDTs for real-time document editing
- Supporting adaptable granularity of changes for massive-scale collaborative editing
- Specification and complexity of collaborative text editing
- Limitations of highly-available eventually-consistent data stores. IEEE Trans. Parallel Distrib. Syst. (2017)
- Replicated data types: specification, verification, optimality
- Introduction to Reliable and Secure Distributed Programming (2011)
- Performance of real-time collaborative editors at large scale: user perspective
- What's different about the new Google Docs: conflict resolution
- What's different about the new Google Docs: making collaboration fast
- Total order broadcast and multicast algorithms: taxonomy and survey. ACM Comput. Surv.
- Concurrency control in groupware systems
- Verifying strong eventual consistency in distributed systems
- How do user groups cope with delay in real-time collaborative note taking
- Formal design and verification of operational transformation algorithms for copies convergence. Theor. Comput. Sci.
- Time, clocks, and the ordering of events in a distributed system. Commun. ACM
- The Wiki Way: Quick Collaboration on the Web
- Preserving operation effects relation in group editors
- Scalable XML collaborative editing with undo
- High-latency, low-bandwidth windowing in the Jupiter collaboration system