Abstract
In recent years, the need to use NoSQL systems to store and exploit big data has steadily increased. Most of these systems are "schema-less": no data model is defined when a database is created. This property offers undeniable flexibility, since users can add new data without changing any data model. However, the lack of an explicit data model makes it difficult to express queries on the database. Users (developers and decision-makers) therefore still need the data model of the database to know how data are stored and related, and then to write their queries. In previous work, we proposed a process to extract the physical model of a document-oriented NoSQL database. In this paper, we extend that work to reverse engineer NoSQL databases, in order to provide semantic knowledge that is close to human understanding. The reverse engineering process is carried out by a set of transformation algorithms. We evaluate our approach experimentally using a case study taken from the medical field.
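To make the "schema-less" problem concrete, the following is a minimal sketch of how an implicit structure could be recovered from a collection of documents. It is not the transformation algorithms of the paper; the function name infer_schema, the path notation, and the sample patient documents are assumptions made only for illustration.

```python
from collections import defaultdict


def infer_schema(documents):
    """Map each dotted field path to the sorted type names observed for it.

    Hypothetical helper for illustration only; it does not reproduce
    the extraction process described in the paper.
    """
    schema = defaultdict(set)

    def visit(value, path):
        if isinstance(value, dict):
            # Record nested objects, but not the top-level document itself.
            if path:
                schema[path].add("object")
            for key, child in value.items():
                visit(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            schema[path].add("array")
            # Array elements share one path, marked with "[]".
            for item in value:
                visit(item, f"{path}[]")
        else:
            schema[path].add(type(value).__name__)

    for doc in documents:
        visit(doc, "")
    return {path: sorted(types) for path, types in schema.items()}


# Invented sample documents: the second one has a field the first lacks,
# which a schema-less store accepts without any declaration.
patients = [
    {"_id": 1, "name": "A. Martin",
     "visits": [{"date": "2020-01-10", "doctor_id": 7}]},
    {"_id": 2, "name": "B. Durand", "age": 54},
]

schema = infer_schema(patients)
for path, types in schema.items():
    print(path, types)
```

A listing of this kind gives a user the field names and nesting needed to write a query, but it stays at the physical level: it says nothing about what a field such as doctor_id refers to, which is the kind of semantic knowledge the reverse engineering described above aims to provide.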
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Abdelhedi, F., Ait Brahim, A., Tighilt Ferhat, R., Zurfluh, G. (2020). Reverse Engineering Approach for NoSQL Databases. In: Song, M., Song, IY., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2020. Lecture Notes in Computer Science, vol 12393. Springer, Cham. https://doi.org/10.1007/978-3-030-59065-9_6
DOI: https://doi.org/10.1007/978-3-030-59065-9_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59064-2
Online ISBN: 978-3-030-59065-9
eBook Packages: Computer Science, Computer Science (R0)