Query Optimization Strategies in Probabilistic Relational Databases

  • Conference paper
Theoretical Computer Science (NCTCS 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 768)

Abstract

Most existing optimization strategies for relational algebra queries in probabilistic relational databases focus on accelerating the probability computation for the lineage expressions of answer tuples. However, none of them considers simplifying the lineage expressions themselves during query processing. To this end, an optimization method is proposed that uses integrity constraints to generate simplified lineage expressions for query results. Simplified lineage expressions for two algebra operations are generated, separately, under functional dependency constraints and under referential constraints. Experiments demonstrate the effectiveness of these optimization strategies for relational algebra queries.
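The abstract does not spell out the simplification rules, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a tuple-independent database with two hypothetical tuples, `r1` (referenced) and `s1` (referencing), and reads a referential constraint as "whenever `s1` exists, `r1` exists too". Under that assumption the join lineage `r1 AND s1` collapses to `s1`. The probabilities and all names are illustrative, and the brute-force enumeration of possible worlds is used only to check that the two expressions agree.

```python
from itertools import product


def conditional_probability(lineage, constraint, probs):
    """Probability of `lineage` given `constraint`, by enumerating possible worlds.

    `probs` maps each tuple variable to its marginal probability; tuples are
    assumed independent before conditioning on the constraint.
    """
    names = sorted(probs)
    satisfied = 0.0  # probability mass of worlds where the constraint holds
    answer = 0.0     # mass of worlds where constraint and lineage both hold
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name in names:
            weight *= probs[name] if world[name] else 1.0 - probs[name]
        if constraint(world):
            satisfied += weight
            if lineage(world):
                answer += weight
    return answer / satisfied


# Hypothetical marginal probabilities for the two tuples.
probs = {"r1": 0.9, "s1": 0.5}


def referential_constraint(world):
    # Assumed reading: s1 references r1, so s1 implies r1.
    return (not world["s1"]) or world["r1"]


def join_lineage(world):
    # Lineage of the join result built from r1 and s1.
    return world["r1"] and world["s1"]


def simplified_lineage(world):
    # Under the constraint, r1 is implied by s1 and can be dropped.
    return world["s1"]


p_full = conditional_probability(join_lineage, referential_constraint, probs)
p_simple = conditional_probability(simplified_lineage, referential_constraint, probs)
print(p_full, p_simple)
```

Both calls return the same value, because every world that satisfies the constraint and contains `s1` also contains `r1`. The simplified expression mentions fewer variables, which is the kind of saving the abstract attributes to constraint-aware lineage generation.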

Author information

Corresponding authors

Correspondence to Caicai Zhang or Hong Zhu.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, C., Cao, Z., Zhu, H. (2017). Query Optimization Strategies in Probabilistic Relational Databases. In: Du, D., Li, L., Zhu, E., He, K. (eds) Theoretical Computer Science. NCTCS 2017. Communications in Computer and Information Science, vol 768. Springer, Singapore. https://doi.org/10.1007/978-981-10-6893-5_16

  • DOI: https://doi.org/10.1007/978-981-10-6893-5_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6892-8

  • Online ISBN: 978-981-10-6893-5

  • eBook Packages: Computer Science (R0)
