Neuroshard: towards automatic multi-objective sharding with deep reinforcement learning
Large databases whose data does not fit on a single server need to shard their rows across multiple database instances. Distributed transactions are significantly more expensive than local transactions, so a popular approach is to collect a trace of past accesses to the database, model it as a graph (or a hypergraph), and solve an NP-hard partitioning problem whose objective is to minimize the fanout, that is, the number of database instances that need to participate in each query. Because of the large amount of data that needs to be sharded, this problem cannot be solved optimally, so databases use heuristic partitioning algorithms, which can be fairly effective in practice. However, fanout is only one objective that affects performance. Other important objectives include load balancing, which ensures that no single database instance becomes overloaded, and equalizing the write traffic across databases to avoid lock contention and I/O amplification. Designing heuristics for more than one objective is difficult and error-prone.
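To make the objectives above concrete, the following minimal sketch scores a shard assignment against a toy access trace. It is an illustration only, not Neuroshard's method: the function names, the trace, and the particular imbalance metric (busiest shard's access count divided by the mean) are assumptions chosen for the example. Each query is modeled as a hyperedge, i.e., the set of rows it touches.

```python
from collections import Counter

def average_fanout(queries, assignment):
    """Mean number of distinct shards that participate in each query."""
    total = sum(len({assignment[row] for row in q}) for q in queries)
    return total / len(queries)

def load_imbalance(queries, assignment, num_shards):
    """Busiest shard's row-access count divided by the mean per-shard count.

    A value of 1.0 means perfectly even load (assumed metric, for illustration).
    """
    load = Counter()
    for q in queries:
        for row in q:
            load[assignment[row]] += 1
    mean = sum(load.values()) / num_shards
    return max(load.values()) / mean

# Hypothetical trace: each query is the set of row keys it accesses.
queries = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "b"}]

# Two candidate assignments of rows to 2 shards.
split = {"a": 0, "b": 0, "c": 1, "d": 1}
single = {"a": 0, "b": 0, "c": 0, "d": 0}

for name, assignment in [("split", split), ("single", single)]:
    print(name,
          average_fanout(queries, assignment),
          load_imbalance(queries, assignment, num_shards=2))
```

On this trace, placing every row on one shard gives the lowest possible fanout but the worst load balance, while splitting the rows improves balance at the cost of a higher fanout. This is the kind of tension between objectives that a single-objective heuristic does not account for.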
We present Neuroshard, the first system that learns shard assignments directly from the workload and optimizes for multiple sharding objectives simultaneously. Neuroshard represents past queries as a neural hypergraph and uses deep reinforcement learning with multi-task learning to generate a learned partitioner that can optimize for multiple objectives in parallel. We implement Neuroshard on a distributed database that uses MariaDB and obtain promising initial results showing that this approach can achieve our versatility and scalability goals, in contrast to baseline approaches that optimize for only one objective, which can work well in one context but perform poorly in another.
June 2022
Published: 11 August 2022
SIGMOD/PODS '22: International Conference on Management of Data
June 17, 2022
Philadelphia, Pennsylvania
