ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP

  • Conference paper
Enabling Real-Time Business Intelligence (BIRTE 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 41)

Abstract

Modern data analysis has given rise to numerous grouping constructs and programming paradigms, well beyond the traditional group by. Applications such as data warehousing, web log analysis, stream monitoring and social network analysis have necessitated the use of data cubes, grouping variables, windows and MapReduce. In this paper we review the associated set (ASSET) concept and discuss its applicability in both continuous and traditional data settings. Given a set of values B, an associated set over B is just a collection of annotated data multisets, one for each b ∈ B. The goal is to efficiently compute aggregates over these data sets. An ASSET query consists of repeated definitions of associated sets and aggregates of these, possibly correlated, resembling a spreadsheet document. We review systems implementing ASSET queries in both continuous and persistent contexts and argue for the analytical abilities and optimization opportunities of associated sets.
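The definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`associated_set`, `aggregate`), the use of a predicate `theta(b, row)` to decide membership, and the sample `sales` rows are all assumptions made for illustration. Each associated set maps every value b in B to a multiset of rows (a list, so duplicates are kept), and the second set is correlated with an aggregate of the first, much like one spreadsheet column depending on another.

```python
from typing import Any, Callable, Dict, Hashable, Iterable, List

Row = Dict[str, Any]


def associated_set(
    base: Iterable[Hashable],
    rows: List[Row],
    theta: Callable[[Hashable, Row], bool],
) -> Dict[Hashable, List[Row]]:
    """For each b in base, collect the multiset of rows r with theta(b, r).

    A list is used as the multiset so that duplicate rows are preserved.
    """
    return {b: [r for r in rows if theta(b, r)] for b in base}


def aggregate(
    asset: Dict[Hashable, List[Row]],
    fn: Callable[[List[Row]], Any],
) -> Dict[Hashable, Any]:
    """Apply an aggregate function to the multiset attached to each b."""
    return {b: fn(members) for b, members in asset.items()}


# Hypothetical data: two identical rows for "ann" show multiset semantics.
sales = [
    {"cust": "ann", "amount": 10},
    {"cust": "bob", "amount": 5},
    {"cust": "ann", "amount": 10},
    {"cust": "ann", "amount": 7},
]
B = ["ann", "bob", "cat"]

# First associated set: each customer's own rows, plus two aggregates.
own = associated_set(B, sales, lambda b, r: r["cust"] == b)
total = aggregate(own, lambda rs: sum(r["amount"] for r in rs))
avg = {b: (total[b] / len(own[b]) if own[b] else 0.0) for b in B}

# Second, correlated associated set: rows of other customers whose
# amount exceeds b's own average computed in the previous step.
others_above = associated_set(
    B, sales, lambda b, r: r["cust"] != b and r["amount"] > avg[b]
)
count_above = aggregate(others_above, len)

print(total)
print(count_above)
```

The nested-loop evaluation here is deliberately naive; the abstract's stated goal is to compute such aggregates efficiently, which this sketch does not attempt.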

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chatziantoniou, D., Sotiropoulos, Y. (2010). ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP. In: Castellanos, M., Dayal, U., Miller, R.J. (eds) Enabling Real-Time Business Intelligence. BIRTE 2009. Lecture Notes in Business Information Processing, vol 41. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14559-9_5

  • DOI: https://doi.org/10.1007/978-3-642-14559-9_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14558-2

  • Online ISBN: 978-3-642-14559-9

  • eBook Packages: Computer Science, Computer Science (R0)
