On Handling One-to-Many Transformations in Relational Systems

  • Conference paper

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 12)

Abstract

The optimization capabilities of RDBMSs make them attractive for executing data transformations that support ETL, data cleaning, and integration activities. Although many useful data transformations can be expressed as relational queries, an important class of data transformations, those that produce several output tuples for a single input tuple, is not adequately supported by RDBMSs.

In this paper we address the issue of extending an RDBMS to include the mapper operator. In particular, we propose an SQL-like syntax together with several logical optimizations involving relational operators and the mapper. Finally, we experimentally compare the mapper operator with RDBMS implementations of one-to-many data transformations and validate the logical optimizations proposed.
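
To make the notion of a one-to-many transformation concrete, the following is a minimal Python sketch, not the paper's operator or its SQL-like syntax. The `mapper` helper, the `split_into_installments` function, and the loan data are all hypothetical names and values chosen for illustration. The sketch also illustrates one kind of logical optimization: a selection on an attribute that the transformation copies through unchanged can be applied before the mapper instead of after it.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]


def mapper(rows: Iterable[Row], fn: Callable[[Row], Iterable[Row]]) -> Iterator[Row]:
    """Apply fn to every input row; fn may yield zero, one, or many output rows."""
    for row in rows:
        yield from fn(row)


def split_into_installments(loan: Row) -> Iterator[Row]:
    """Hypothetical one-to-many function: split a loan into capped installments."""
    remaining = loan["amount"]
    seq = 1
    while remaining > 0:
        part = min(remaining, loan["max_installment"])
        # loan_id is copied through unchanged from the input row.
        yield {"loan_id": loan["loan_id"], "seq": seq, "amount": part}
        remaining -= part
        seq += 1


# Illustrative data only.
loans: List[Row] = [
    {"loan_id": 1, "amount": 250, "max_installment": 100},
    {"loan_id": 2, "amount": 80, "max_installment": 100},
]

# One input row can produce several output rows.
installments = list(mapper(loans, split_into_installments))

# Selection applied after the mapper ...
filtered_after = [r for r in installments if r["loan_id"] == 1]

# ... and the same selection pushed before the mapper. This is valid here
# because loan_id passes through the transformation unchanged.
filtered_before = list(
    mapper((l for l in loans if l["loan_id"] == 1), split_into_installments)
)
```

In this sketch both plans return the same rows, but the second one invokes the transformation on fewer input rows, which is the intuition behind pushing selections past a mapper.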

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Carreira, P., Galhardas, H., Pereira, J.D., Wichert, A. (2008). On Handling One-to-Many Transformations in Relational Systems. In: Filipe, J., Cordeiro, J., Cardoso, J. (eds) Enterprise Information Systems. ICEIS 2007. Lecture Notes in Business Information Processing, vol 12. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88710-2_10

  • DOI: https://doi.org/10.1007/978-3-540-88710-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88709-6

  • Online ISBN: 978-3-540-88710-2

  • eBook Packages: Computer Science (R0)
