A Methodology to Prepare Real-World and Large Databases to Ontology Learning

  • Conference paper

Part of the book series: Proceedings of the I-ESA Conferences (IESACONF, volume 7)

Abstract

Several approaches have been proposed for generating application ontologies from relational databases. Most of them propose a fully automatic process based on the unrealistic assumption that the input database is well designed, up to the third normal form (3NF). Real-world databases, however, may contain information that is irrelevant to the ontology learning process, as well as missing or erroneous data. Preparing a database before ontology learning is rarely addressed. In this paper we propose a methodology for Database Preparation (DBP) composed of three sub-processes: the extraction of a Business Database (BDB), the cleaning of the BDB, and the enrichment of the cleaned BDB. A proof-theoretical case study shows that the proposed methodology is feasible and useful.
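
The three sub-processes named in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the table names (`customer`, `orders`, `ui_log`), the cleaning rules, and the `fk_hint` metadata table are all hypothetical, and SQLite stands in for whatever database system is actually used.

```python
import sqlite3


def extract_business_db(conn, business_tables):
    """Step 1 (extraction): drop tables not listed as business-relevant."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for name in names:
        if name not in business_tables:
            conn.execute(f'DROP TABLE "{name}"')


def clean_business_db(conn):
    """Step 2 (cleaning): remove rows with missing or dangling values.

    Both rules are assumptions chosen for the demo schema.
    """
    conn.execute("DELETE FROM customer WHERE name IS NULL OR TRIM(name) = ''")
    conn.execute(
        "DELETE FROM orders WHERE customer_id NOT IN (SELECT id FROM customer)")


def enrich_business_db(conn):
    """Step 3 (enrichment): record an implicit relationship as explicit metadata.

    fk_hint is a hypothetical helper table, not part of any standard.
    """
    conn.execute(
        "CREATE TABLE fk_hint "
        "(child TEXT, child_col TEXT, parent TEXT, parent_col TEXT)")
    conn.execute(
        "INSERT INTO fk_hint VALUES ('orders', 'customer_id', 'customer', 'id')")


def build_demo():
    """Create a tiny in-memory database with one irrelevant table and dirty rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE ui_log (id INTEGER PRIMARY KEY, message TEXT);
        INSERT INTO customer VALUES (1, 'Acme'), (2, NULL);
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99);
        INSERT INTO ui_log VALUES (1, 'window resized');
    """)
    return conn


def prepare(conn):
    """Run the three steps in the order given in the abstract."""
    extract_business_db(conn, {"customer", "orders"})
    clean_business_db(conn)
    enrich_business_db(conn)
    return conn
```

Calling `prepare(build_demo())` leaves only the business tables plus the enrichment metadata. In practice each step would likely need a domain expert's input rather than fixed rules, which is consistent with the abstract's criticism of fully automatic processes.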

Author information

Corresponding author

Correspondence to Bouchra El Idrissi.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

El Idrissi, B., Baïna, S., Baïna, K. (2014). A Methodology to Prepare Real-World and Large Databases to Ontology Learning. In: Mertins, K., Bénaben, F., Poler, R., Bourrières, JP. (eds) Enterprise Interoperability VI. Proceedings of the I-ESA Conferences, vol 7. Springer, Cham. https://doi.org/10.1007/978-3-319-04948-9_15

  • DOI: https://doi.org/10.1007/978-3-319-04948-9_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-04947-2

  • Online ISBN: 978-3-319-04948-9

  • eBook Packages: Engineering, Engineering (R0)
