Efficient top-(k,l) range query processing for uncertain data based on multicore architectures

Distributed and Parallel Databases

Abstract

Query processing over uncertain data is important in many applications because real-world data are often uncertain. In this paper, we first define a new query over uncertain databases, the uncertain top-(k,l) range (UTR) query, which retrieves the \(l\) uncertain tuples that are expected to satisfy the score range constraint [\(CR_1\),\(CR_2\)] and have the highest top-k probabilities, provided those probabilities are no less than a user-specified probability threshold \(q\). To answer UTR queries faster, we propose pruning rules that reduce the UTR query search space and integrate them into an efficient UTR query procedure. To further improve the efficiency and effectiveness of the UTR query, we present a parallel UTR (PUTR) query procedure. Extensive experiments verify the efficiency and effectiveness of the proposed algorithms. Notably, the PUTR query procedure performs much more efficiently and effectively than the UTR query procedure.
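To make the query semantics concrete, the following is a minimal brute-force sketch of a UTR-style query. It is not the pruning-based or parallel procedure of the paper. It assumes a tuple-level uncertainty model in which each tuple has a distinct score and an independent existence probability, it assumes that top-k probabilities are computed over the whole database before the range constraint is applied, and it treats the range constraint as a plain test on the tuple score. The function names `topk_probabilities` and `utr_query` and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

# (tuple id, score, existence probability); this layout is an assumption.
UTuple = Tuple[str, float, float]


def topk_probabilities(tuples: List[UTuple], k: int) -> Dict[str, float]:
    """Pr(tuple is among the k highest-scoring existing tuples).

    Assumes independent tuples with distinct scores.
    """
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)
    # dist[j] = Pr(exactly j higher-scoring tuples exist), kept only for j < k.
    dist = [1.0] + [0.0] * (k - 1)
    result: Dict[str, float] = {}
    for tid, _score, p in ordered:
        # The tuple is in the top k if it exists and fewer than k
        # higher-scoring tuples exist.
        result[tid] = p * sum(dist)
        # Fold this tuple into the count distribution (descending j so that
        # dist[j - 1] still holds its previous value).
        for j in range(k - 1, 0, -1):
            dist[j] = dist[j] * (1.0 - p) + dist[j - 1] * p
        dist[0] *= 1.0 - p
    return result


def utr_query(tuples: List[UTuple], k: int, l: int,
              cr1: float, cr2: float, q: float) -> List[Tuple[str, float]]:
    """Return up to l (id, top-k probability) pairs, highest probability first,
    restricted to scores in [cr1, cr2] and probabilities of at least q."""
    probs = topk_probabilities(tuples, k)
    candidates = [(tid, probs[tid]) for tid, score, _p in tuples
                  if cr1 <= score <= cr2 and probs[tid] >= q]
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:l]


if __name__ == "__main__":
    data = [("a", 100.0, 0.5), ("b", 90.0, 0.8),
            ("c", 80.0, 0.6), ("d", 70.0, 0.9)]
    print(utr_query(data, k=2, l=2, cr1=75.0, cr2=95.0, q=0.3))
```

This baseline computes the top-k probability of every tuple, which is the work that pruning rules would aim to avoid. The per-tuple probabilities depend on a running distribution, so a parallel version would need a different decomposition than a naive loop split.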

Notes

  1. http://openmp.org

  2. http://nsidc.org/data/g00807.html

Acknowledgments

The authors would like to thank the three anonymous reviewers for their valuable and helpful comments on improving the manuscript. This research was partially funded by the Key Program of the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005), the National Natural Science Foundation of China (Grant Nos. 61370095, 61202109, 1472124), and the National Science Foundation for Distinguished Young Scholars of Hunan (Grant No. 12JJ1011).

Author information

Corresponding author

Correspondence to Kenli Li.

About this article

Cite this article

Xiao, G., Li, K., Li, K. et al. Efficient top-(k,l) range query processing for uncertain data based on multicore architectures. Distrib Parallel Databases 33, 381–413 (2015). https://doi.org/10.1007/s10619-014-7156-8

  • DOI: https://doi.org/10.1007/s10619-014-7156-8
