Abstract
Due to the recent massive growth in data generation, preference queries are becoming increasingly important for users because such queries retrieve only a small number of preferable data objects from a huge multi-dimensional dataset. A top-k dominating query, which retrieves the k data objects that dominate the largest number of other data objects in a given dataset, is particularly important in supporting multi-criteria decision making because it finds interesting data objects in an intuitive way, combining the advantages of top-k and skyline queries. Although efficient algorithms for top-k dominating queries have been studied for centralized databases, no existing study deals with top-k dominating queries in distributed environments. Because data management is becoming increasingly distributed, it is necessary to support the processing of top-k dominating queries in distributed environments. In this paper, we address, for the first time, the challenging problem of processing top-k dominating queries in distributed networks and propose a method for efficient top-k dominating data retrieval that avoids redundant communication cost and latency. We also propose an approximate version of our method that further reduces communication cost. Extensive experiments on both synthetic and real data demonstrate the efficiency and effectiveness of our proposed methods.
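
To make the query definition concrete, the following is a minimal sketch of a naive, centralized top-k dominating query. It is not the distributed method proposed in the paper; it only illustrates what the query returns. The function names `dominates` and `top_k_dominating` are hypothetical, and the sketch assumes that smaller values are preferable in every dimension, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when
    p is no worse than q everywhere and strictly better somewhere.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def top_k_dominating(data: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Naive O(n^2) centralized evaluation (illustration only).

    Scores each object by the number of objects it dominates and
    returns the k highest-scoring objects with their scores.
    """
    scored = [(p, sum(dominates(p, q) for q in data)) for p in data]
    # list.sort is stable, so ties keep their input order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    objects = [(1, 1), (2, 3), (3, 2), (4, 4), (1, 5)]
    for point, score in top_k_dominating(objects, k=2):
        print(point, score)
```

In this toy dataset, (1, 1) dominates all four other objects, so it ranks first. Every object is compared against every other object, which is exactly the kind of cost that becomes prohibitive when the data are spread across many nodes.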











Acknowledgments
This research is partially supported by the Grant-in-Aid for Scientific Research (A)(26240013) of MEXT, and JSPS Fellows (26-4907) of the Ministry of Education, Culture, Sports, Science and Technology, Japan.
Cite this article
Amagata, D., Sasaki, Y., Hara, T. et al. Efficient processing of top-k dominating queries in distributed environments. World Wide Web 19, 545–577 (2016). https://doi.org/10.1007/s11280-015-0340-6
DOI: https://doi.org/10.1007/s11280-015-0340-6