Abstract
Probabilistic databases are an established technique for managing and analysing large uncertain data sets. A major challenge for probabilistic databases is query evaluation: even for some simple relational queries, exact probability computation is \(\#\mathcal{P}\)-hard. If only the \(k\) highest-ranked tuples are of interest, an efficient pre-filter can therefore reduce the computation time significantly. In this work we present a top-\(k\) filter that computes, in polynomial time, a small candidate set for the top-\(k\) answer of a complex relational query.
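The abstract does not spell out how the filter works. As a rough illustration of bound-based pre-filtering in general, and not necessarily the authors' algorithm, the sketch below assumes that cheap lower and upper bounds on each tuple's probability are already available. The names `Bounded` and `topk_candidates` are hypothetical. A tuple can be discarded when its upper bound lies below the \(k\)-th largest lower bound, because at least \(k\) other tuples are then guaranteed to rank above it.

```python
import heapq
from typing import List, NamedTuple


class Bounded(NamedTuple):
    """A result tuple with assumed (hypothetical) probability bounds."""
    tuple_id: str
    lower: float  # guaranteed lower bound on the exact probability
    upper: float  # guaranteed upper bound on the exact probability


def topk_candidates(tuples: List[Bounded], k: int) -> List[Bounded]:
    """Return every tuple that could still belong to the top-k answer."""
    if k <= 0:
        return []
    if len(tuples) <= k:
        # Nothing can be pruned: every tuple is in the top-k.
        return list(tuples)
    # k-th largest lower bound: at least k tuples reach this probability.
    threshold = heapq.nlargest(k, (t.lower for t in tuples))[-1]
    # Keep a tuple unless its upper bound is strictly below the threshold.
    return [t for t in tuples if t.upper >= threshold]


example = [
    Bounded("a", 0.8, 0.9),
    Bounded("b", 0.5, 0.7),
    Bounded("c", 0.1, 0.4),
    Bounded("d", 0.3, 0.6),
]
candidates = topk_candidates(example, k=2)
print([t.tuple_id for t in candidates])
```

In this example the second-largest lower bound is 0.5, so only tuple `c` (upper bound 0.4) is pruned; the exact and expensive probability computation would then be needed only for the remaining candidates.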
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lehrack, S., Saretz, S. (2012). A Top-k Filter for Logic-Based Similarity Conditions on Probabilistic Databases. In: Morzy, T., Härder, T., Wrembel, R. (eds) Advances in Databases and Information Systems. ADBIS 2012. Lecture Notes in Computer Science, vol 7503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33074-2_20
DOI: https://doi.org/10.1007/978-3-642-33074-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33073-5
Online ISBN: 978-3-642-33074-2
eBook Packages: Computer Science, Computer Science (R0)