A distributed data exchange engine for polystores

  • Abdulrahman Kaitoua

    Abdulrahman Kaitoua is a senior big data architect and team lead in the innovation and research team of GK-Software SE in Berlin, Germany. He received his Ph.D. with honors in Information Technology from Politecnico di Milano in 2017.

  • Tilmann Rabl

    Tilmann Rabl is a Full Professor and Chair of the Data Engineering Systems Group at Hasso Plattner Institute and the University of Potsdam. He is also a cofounder of the startup bankmark.

  • Volker Markl

    Volker Markl is a Full Professor and Chair of the DIMA Group at TU Berlin and an Adjunct Full Professor at the University of Toronto. He is Director of the Intelligent Analytics for Massive Data Research Group at DFKI and Director of the Berlin Big Data Center.


There is increasing interest in fusing data from heterogeneous sources. Combining data sources increases the utility of existing datasets, generating new information and creating services of higher quality. A central issue in working with heterogeneous sources is data migration: to share and process data in different engines, resource-intensive and complex movements and transformations between computing engines, services, and stores are necessary.

Muses is a distributed, high-performance data migration engine that interconnects distributed data stores by forwarding, transforming, repartitioning, or broadcasting data among the instances of distributed engines in a resource-, cost-, and performance-adaptive manner. As such, it enables seamless information sharing across all participating resources in a standard, modular manner. We show an overall improvement of 30 % for pipelining jobs across multiple engines, even when the overhead of Muses is included in the execution time. This performance gain implies that Muses can be used to optimise large pipelines that leverage multiple engines.


Funding statement: This work has been supported through grants by the German Ministry for Education and Research as BIFOLD (01IS18025A and 01IS18037).


