Abstract
Most existing optimization strategies for relational algebra queries in probabilistic relational databases focus on accelerating the probability computation for the lineage expressions of answer tuples. However, none of them considers simplifying the lineage expressions themselves during query processing. To address this, we propose an optimization method that uses integrity constraints to generate simplified lineage expressions for query results. Simplified lineage expressions for two algebra operations are generated under functional dependency constraints and under referential constraints, treated separately. Experiments demonstrate the effectiveness of these optimization strategies for relational algebra queries.
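As a rough illustration of why a smaller lineage expression helps, the sketch below assumes a tuple-independent database and a lineage expression in monotone DNF. It is not the authors' algorithm: the constraint-based rewriting rules are not given in the abstract, so plain Boolean absorption stands in for them here. The names `prob` and `absorb`, the tuple variables `r1`, `s1`, `s2`, and their probabilities are all hypothetical.

```python
from itertools import product


def prob(dnf, p):
    """Probability that a monotone DNF lineage is true.

    dnf is a set of clauses, each a frozenset of tuple variables;
    p maps each variable to its (independent) marginal probability.
    Brute force over all possible worlds, so only usable on tiny inputs.
    """
    variables = sorted({v for clause in dnf for v in clause})
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, bits))
        # The lineage holds if every variable of some clause is present.
        if any(all(world[v] for v in clause) for clause in dnf):
            weight = 1.0
            for v in variables:
                weight *= p[v] if world[v] else 1.0 - p[v]
            total += weight
    return total


def absorb(dnf):
    """Drop clauses that strictly contain another clause: x OR (x AND y) = x."""
    clauses = set(dnf)
    return {c for c in clauses if not any(o < c for o in clauses)}


# Hypothetical lineage of one answer tuple: r1 OR (r1 AND s1) OR (r1 AND s2)
p = {"r1": 0.5, "s1": 0.4, "s2": 0.8}
lineage = {frozenset({"r1"}), frozenset({"r1", "s1"}), frozenset({"r1", "s2"})}
simplified = absorb(lineage)

print(len(lineage), "clauses before,", len(simplified), "after")
print(prob(lineage, p), prob(simplified, p))
```

The simplified expression mentions one variable instead of three, so the enumeration visits 2 worlds rather than 8 while yielding the same probability. Rewriting driven by functional dependencies or referential constraints would aim for the same kind of reduction during query processing.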
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, C., Cao, Z., Zhu, H. (2017). Query Optimization Strategies in Probabilistic Relational Databases. In: Du, D., Li, L., Zhu, E., He, K. (eds) Theoretical Computer Science. NCTCS 2017. Communications in Computer and Information Science, vol 768. Springer, Singapore. https://doi.org/10.1007/978-981-10-6893-5_16
DOI: https://doi.org/10.1007/978-981-10-6893-5_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6892-8
Online ISBN: 978-981-10-6893-5
eBook Packages: Computer Science (R0)