DOI: 10.1145/1458432.1458444
research-article

Natural language reporting for ETL processes

Published: 30 October 2008

ABSTRACT

The conceptual design of Extract-Transform-Load (ETL) processes is a crucial, burdensome, and challenging procedure that takes place in the early phases of a Data Warehouse project. Several models have been proposed for the conceptual design and representation of ETL processes, but all share two drawbacks: they require intensive human effort from designers to create them, and technical knowledge from business people to understand them. In previous work, we eased the former difficulty by automating the conceptual design using Semantic Web technology. In this paper, we build upon our previous results and tackle the second issue by investigating the application of natural language generation techniques to the ETL environment. In particular, we provide a method for representing a conceptual ETL design as a narrative, which is the most natural means of communication and does not require knowledge of any specific model. We discuss how linguistic techniques can be used to establish a common application vocabulary. Finally, we present a flexible and customizable template-based mechanism for generating natural language representations of ETL process requirements and operations.
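
To give a rough sense of what template-based generation for ETL operations could look like, the following is a minimal sketch and not the mechanism described in the paper. The operation types (`filter`, `convert`, `load`), the template wording, and the function names `describe` and `narrate` are all hypothetical, chosen only for illustration.

```python
from string import Template

# Hypothetical sentence templates, one per ETL operation type.
# The operation names and wording are assumptions for this sketch.
TEMPLATES = {
    "filter": Template("Keep only the $entity records where $condition."),
    "convert": Template("Convert $attribute from $source_unit to $target_unit."),
    "load": Template("Load the $entity records into $target."),
}


def describe(operation):
    """Render a single ETL operation as one English sentence."""
    template = TEMPLATES[operation["type"]]
    # substitute() raises KeyError if a placeholder has no value,
    # which surfaces incomplete operation descriptions early.
    return template.substitute(operation["args"])


def narrate(operations):
    """Join the sentences for a sequence of operations into a short narrative."""
    return " ".join(describe(op) for op in operations)


# A toy flow; the entity and attribute names are made up.
flow = [
    {"type": "filter",
     "args": {"entity": "order", "condition": "the amount is positive"}},
    {"type": "convert",
     "args": {"attribute": "price", "source_unit": "USD", "target_unit": "EUR"}},
    {"type": "load",
     "args": {"entity": "order", "target": "the sales fact table"}},
]

print(narrate(flow))
```

Keeping the templates in a plain dictionary means the wording can be customized per audience without touching the rendering code, which is one simple way to read "flexible and customizable" in this setting.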

    Published in

      DOLAP '08: Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
      October 2008
      104 pages
      ISBN: 9781605582504
      DOI: 10.1145/1458432

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall acceptance rate: 29 of 79 submissions, 37%
