ABSTRACT
In this vision paper, we introduce the idea of a framework that would enable us to model, represent, and manage multi-model data in a unified and abstract way. Its core exploits constructs provided by category theory, which is sufficiently general yet still simple enough to cover any of the logical data models used in contemporary databases. Focusing on promising features and building on mature and verified principles, we give an overview of the key parts of the framework and outline the open questions and research directions that need further investigation. The ultimate objective is a self-tuning system that would allow us to collapse the traditionally separate conceptual and logical layers into a single model, enabling unified handling of schemas, data instances, and queries.
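To make the categorical idea more concrete, the following is a minimal sketch and not the framework proposed in the paper. It assumes the common textbook reading in which a schema is a small category generated by named objects and morphisms, and an instance assigns a set to each object and a function to each morphism. All names here (`Schema`, `Instance`, `placed_by`, `follow`, and the sample data) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Morphism:
    """A named arrow between two schema objects (hypothetical structure)."""
    name: str
    dom: str
    cod: str


@dataclass
class Schema:
    """A schema as a graph of objects and morphisms; paths act as composites."""
    objects: set = field(default_factory=set)
    morphisms: dict = field(default_factory=dict)

    def add_morphism(self, name, dom, cod):
        self.objects.update({dom, cod})
        self.morphisms[name] = Morphism(name, dom, cod)

    def path_type(self, path):
        """Return (domain, codomain) of a non-empty composable path."""
        if not path:
            raise ValueError("path must contain at least one morphism")
        steps = [self.morphisms[n] for n in path]
        for f, g in zip(steps, steps[1:]):
            if f.cod != g.dom:
                raise ValueError(f"cannot compose {f.name} with {g.name}")
        return steps[0].dom, steps[-1].cod


@dataclass
class Instance:
    """Assigns a set to each object and a total function to each morphism."""
    schema: Schema
    elements: dict
    mappings: dict

    def is_valid(self):
        """Check each mapping is total on its domain and lands in its codomain."""
        for name, m in self.schema.morphisms.items():
            mapping = self.mappings.get(name, {})
            if set(mapping) != self.elements.get(m.dom, set()):
                return False
            if not set(mapping.values()) <= self.elements.get(m.cod, set()):
                return False
        return True

    def follow(self, path, start):
        """Evaluate a composable path of morphisms on one starting element."""
        self.schema.path_type(path)  # raises if the path is not composable
        value = start
        for name in path:
            value = self.mappings[name][value]
        return value


# Illustrative data only: orders reference customers, customers have names.
schema = Schema()
schema.add_morphism("placed_by", "Order", "Customer")
schema.add_morphism("name", "Customer", "String")

instance = Instance(
    schema=schema,
    elements={"Order": {"o1", "o2"}, "Customer": {"c1"}, "String": {"Alice"}},
    mappings={"placed_by": {"o1": "c1", "o2": "c1"}, "name": {"c1": "Alice"}},
)
```

Under these assumptions, a query such as "the name of the customer who placed order `o1`" is just the composite path `["placed_by", "name"]` evaluated with `instance.follow`. The same objects-and-arrows description could in principle sit above a relational table, a document collection, or a graph, which is the kind of uniformity the abstract alludes to.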
Categorical Management of Multi-Model Data