Abstract
Query processing over uncertain data is important in many applications because real-world data is often uncertain. In this paper, we first define a new query over an uncertain database, the uncertain top-(k,l) range (UTR) query, which retrieves the \(l\) uncertain tuples that are expected to satisfy the score range constraint [\(CR_1\),\(CR_2\)] and that have the highest top-k probabilities, provided those probabilities are no less than a user-specified threshold \(q\). To answer UTR queries faster, we propose several effective pruning rules that reduce the UTR query space and integrate them into an efficient UTR query procedure. To further improve efficiency and effectiveness, we also present a parallel UTR (PUTR) query procedure. Extensive experiments verify the efficiency and effectiveness of the proposed algorithms. Notably, the PUTR query procedure performs considerably more efficiently and effectively than the UTR query procedure.
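To make the query definition concrete, the following is a minimal sketch of UTR semantics, not the pruning-based procedure or the parallel PUTR procedure proposed in the paper. It rests on several assumptions that the abstract does not state: each tuple has a fixed score and an independent existence probability, the top-k probability is computed only among tuples whose scores fall inside [\(CR_1\),\(CR_2\)], and score ties are broken by input order. The names `UTuple` and `utr_query` are hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class UTuple(NamedTuple):
    tid: str      # tuple identifier (hypothetical field name)
    score: float  # deterministic ranking score
    prob: float   # existence probability, assumed independent of other tuples


def utr_query(table: List[UTuple], k: int, l: int,
              cr1: float, cr2: float, q: float) -> List[Tuple[str, float]]:
    """Return up to l (tid, top-k probability) pairs, highest probability first.

    Assumption: ranking is relative to the in-range tuples only.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Keep only tuples inside the score range, highest score first.
    candidates = sorted((t for t in table if cr1 <= t.score <= cr2),
                        key=lambda t: t.score, reverse=True)

    # ahead[j] = probability that exactly j higher-scored tuples exist,
    # tracked only for j < k (larger counts cannot yield a top-k rank).
    ahead = [1.0] + [0.0] * (k - 1)
    results = []
    for t in candidates:
        # t is in the top-k if it exists and fewer than k better tuples exist.
        topk_prob = t.prob * sum(ahead)
        if topk_prob >= q:
            results.append((t.tid, topk_prob))
        # Fold t into the distribution before moving to lower-scored tuples.
        for j in range(k - 1, 0, -1):
            ahead[j] = ahead[j] * (1.0 - t.prob) + ahead[j - 1] * t.prob
        ahead[0] *= 1.0 - t.prob

    results.sort(key=lambda r: r[1], reverse=True)
    return results[:l]


# Illustrative data, not taken from the paper.
example_table = [
    UTuple("a", 100.0, 0.5),
    UTuple("b", 90.0, 0.8),
    UTuple("c", 80.0, 0.4),
    UTuple("d", 50.0, 0.9),
]
answer = utr_query(example_table, k=2, l=2, cr1=60.0, cr2=100.0, q=0.3)
```

In this example, tuple `d` is excluded by the range constraint and tuple `c` falls below the threshold \(q\), so only `b` and `a` remain as candidates for the \(l\) answers. The sketch scans every in-range tuple; the pruning rules mentioned in the abstract would presumably stop such a scan early, but their details are not given here.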
Acknowledgments
The authors would like to thank the three anonymous reviewers for their valuable and helpful comments on improving the manuscript. This research was partially funded by the Key Program of National Natural Science Foundation of China (Grant Nos. 61133005, 61432005), the National Natural Science Foundation of China (Grant Nos. 61370095, 61202109, 1472124), and the National Science Foundation for Distinguished Young Scholars of Hunan (12JJ1011).
Cite this article
Xiao, G., Li, K., Li, K. et al. Efficient top-(k,l) range query processing for uncertain data based on multicore architectures. Distrib Parallel Databases 33, 381–413 (2015). https://doi.org/10.1007/s10619-014-7156-8
DOI: https://doi.org/10.1007/s10619-014-7156-8