Extraction of Semantic Links from a Document-Oriented NoSQL Database

  • Original Research
SN Computer Science

A Correction to this article was published on 10 February 2023

This article has been updated

Abstract

Most NoSQL systems do not require a schema to be declared before a database (DB) is created. This “schemaless” property is valuable because it gives considerable flexibility when the data are used. However, the absence of a schema is a major obstacle to writing precise queries against a DB. A new area of research has therefore emerged to let users of schemaless NoSQL systems visualize the schema of their data. Existing work proposes schema extraction processes, but these solutions are generally limited. In our previous work (Abdelhedi et al. in Proceedings of the 10th international conference on model-driven engineering and software development, pp 61–71. https://doi.org/10.5220/0010899000003119, 2022), we proposed a logical schema extraction process for a document-oriented NoSQL DB to address the needs of a medical application. In this paper, we extend this process to additional relationship types. To do this, we use the Model-Driven Architecture (MDA), which proposes a development method based on metamodeling and the definition of transformation rules. The DB schema is obtained by applying a set of transformation rules to the specifications extracted from the DB. The benefit of our process is that it produces a schema that allows users of a NoSQL DB to build complex and precise queries. This is useful both for computer scientists, who write a large number of complex queries, and for decision makers, who often have difficulty grasping the semantics of the data. Our extraction process was implemented in a medical application.
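
To make the idea of extracting a schema and its links from schemaless documents more concrete, the following Python sketch may help. It is not the authors' MDA-based process and uses no metamodels or transformation rules; it is a simple heuristic written for illustration only. The collection names (`patients`, `doctors`), the field names, and the helper functions `infer_fields` and `find_links` are all hypothetical, and the sketch assumes every document carries an `_id` field.

```python
from collections import defaultdict

# Hypothetical sample data: two collections of schemaless documents.
db = {
    "patients": [
        {"_id": "p1", "name": "Ana", "doctor": "d1"},
        {"_id": "p2", "name": "Ben", "doctor": "d2", "age": 40},
    ],
    "doctors": [
        {"_id": "d1", "name": "Dr. Lee"},
        {"_id": "d2", "name": "Dr. Kim"},
    ],
}


def infer_fields(docs):
    """Map each field name to the set of type names observed for it."""
    fields = defaultdict(set)
    for doc in docs:
        for key, value in doc.items():
            fields[key].add(type(value).__name__)
    return dict(fields)


def find_links(database):
    """Guess reference links between collections.

    Assumption for this sketch: a field is treated as a link when every
    scalar value it holds equals the _id of a document in some collection.
    """
    ids = {name: {doc["_id"] for doc in docs} for name, docs in database.items()}
    links = []
    for source, docs in database.items():
        values = defaultdict(set)
        for doc in docs:
            for key, value in doc.items():
                if key != "_id" and isinstance(value, (str, int)):
                    values[key].add(value)
        for field, seen in sorted(values.items()):
            for target, target_ids in sorted(ids.items()):
                if seen <= target_ids:
                    links.append((source, field, target))
    return links


print(find_links(db))  # [('patients', 'doctor', 'doctors')]
```

A heuristic of this kind can report false links when values coincide by chance, which is one reason a rule-based process over explicit metamodels, as described in the abstract, may be preferable.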

References

  1. Abdelhedi F, Rajhi H, Zurfluh G. Extraction process of the logical schema of a document-oriented NoSQL database. In: Proceedings of the 10th international conference on model-driven engineering and software development, 2022, pp. 61–71. https://doi.org/10.5220/0010899000003119.

  2. Wang L, Wang J, Wang M, Li Y, Liang Y, Xu D. Using internet search engines to obtain medical information: a comparative study. J Med Internet Res. 2012;14(3):e74. https://doi.org/10.2196/jmir.1943.

  3. Wang L, Hassanzadeh O, Zhang S, Shi J, Jiao L, Zou J, Wang C. Schema management for document stores. Proc VLDB Endow. 2015;8:922–33. https://doi.org/10.14778/2777598.2777601.

  4. Baazizi M-A, Lahmar HB, Colazzo D, Ghelli G, Sartiani C. Schema inference for massive JSON datasets. Extend Database Technol. 2017. https://doi.org/10.5441/002/edbt.2017.21.

  5. Baazizi M-A, Colazzo D, Ghelli G, Sartiani C. Parametric schema inference for massive JSON datasets. VLDB J. 2019;28(4):497–521. https://doi.org/10.1007/s00778-018-0532-7.

  6. Frozza AA, dos Santos Mello R, da Costa FS. An approach for schema extraction of JSON and extended JSON document collections. In: IEEE international conference on information reuse and integration (IRI), 2018. pp. 356–63. https://doi.org/10.1109/IRI.2018.00060.

  7. Fruth M, Dauberschmidt K, Scherzinger S. Josch: Managing schemas for NoSQL document stores. In: IEEE 37th international conference on data engineering (ICDE), 2021. pp. 2693–6. https://doi.org/10.1109/ICDE51399.2021.00306.

  8. Istiqamah AN, Wiharja KRS. A schema extraction of document-oriented database for data warehouse. Int J Inf Commun Technol. 2021;7(2):36–47. https://doi.org/10.21108/ijoict.v7i2.584.

  9. Aftab Z, Iqbal W, Almustafa KM, Bukhari F, Abdullah M. Automatic NoSQL to relational database transformation with dynamic schema mapping. Sci Programm. 2020. https://doi.org/10.1155/2020/8813350.

  10. Chillón AH, Hoyos JR, García-Molina J, Ruiz DS. Discovering entity inheritance relationships in document stores. Knowl Based Syst. 2021;230:107394. https://doi.org/10.1016/j.knosys.2021.107394.

  11. Ruiz DS, Morales SF, Molina JG. Inferring versioned schemas from NoSQL databases and its applications. In: International conference on conceptual modeling, 2015, pp. 467–480. https://doi.org/10.1007/978-3-319-25264-3_35.

  12. ODMS: Operational Database Management Systems. http://www.odbms.org/odmg-standard/. Accessed 10 Apr 2021.

  13. OrientDB. https://orientdb.org/. Accessed 10 Apr 2021.

  14. OMG. MDA-The architecture of choice for a changing world. https://www.omg.org/mda. Accessed 1 Apr 2021.

  15. Laney D. 3D data management: Controlling data volume, velocity and variety. META group research note, 2001.

  16. Idera: Data modeling tools for enterprise-scale data architecture. https://www.idera.com/products/er-studio/enterprise-data-modeling. Accessed 2 June 2021.

  17. Erwin: Erwin Data Modeler. https://www.erwin.com/products/erwin-data-modeler/. Accessed 2 June 2021.

  18. MongoDB. https://www.mongodb.com/products/compass. Accessed 5 Sept 2021.

  19. MongoDB. https://www.mongodb.com/. Accessed 5 Sept 2021.

  20. OMG. https://www.omg.org/. Accessed 1 June 2021.

  21. Eclipse: Eclipse Modeling Framework (EMF) https://www.eclipse.org/modeling/emf/. Accessed 10 Jan 2022.

  22. Eclipse: Ecore Tools. https://www.eclipse.org/ecoretools/doc/index.html. Accessed 20 Jan 2022.

  23. OMG: MOF query/view/transformation. https://www.omg.org/spec/QVT/1.3/About-QVT/. Accessed 20 Jan 2022.

  24. UML. https://www.uml.org/. Accessed 12 Dec 2021.

  25. Baazizi M-A, Colazzo D, Ghelli G, Sartiani C. A type system for interactive JSON schema inference. In: 46th international colloquium on automata, languages, and programming (ICALP), 2019.

  26. OrientDB: basic concepts. https://orientdb.org/docs/3.0.x/datamodeling/Concepts.html. Accessed 10 Jan 2022.

  27. OMG: XML metadata interchange. https://www.omg.org/spec/XMI/. Accessed 20 Jan 2022.

Author information

Corresponding author

Correspondence to Hela Rajhi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Advances on Model-Driven Engineering and Software Development” guest edited by Luís Ferreira Pires and Slimane Hammoudi.

The original online version of this article was revised: incorrect versions of Figures 2 and 3 were published in the original publication. They have now been corrected.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Abdelhedi, F., Rajhi, H. & Zurfluh, G. Extraction of Semantic Links from a Document-Oriented NoSQL Database. SN COMPUT. SCI. 4, 148 (2023). https://doi.org/10.1007/s42979-022-01578-z

  • DOI: https://doi.org/10.1007/s42979-022-01578-z
