ABSTRACT
Sequence data processing has been studied extensively in the literature.
In recent years, the warehousing and online analytical processing (OLAP) of archived sequence data have received growing attention. In particular, the concept of sequence OLAP (S-OLAP) was recently proposed with the objective of evaluating so-called Pattern-Based Aggregate (PBA) queries, so that a wide range of analytical tasks on sequence data can be carried out efficiently. This paper studies the evaluation of ranking PBA queries, which rank the results of PBA queries and return only the top-ranked ones to users. We discuss how ranking PBA queries drastically improve the usability of S-OLAP systems and present techniques for evaluating different kinds of ranking PBA queries efficiently.
Index Terms
- Supporting ranking pattern-based aggregate queries in sequence data cubes