Preference-Based Top-k Representative Skyline Queries on Uncertain Databases

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Top-k representative skyline queries are important for multi-criteria decision-making applications because they give data analysts an intuitive way to identify the k most significant objects. Despite their importance, these queries have not received adequate attention from the research community, and existing work on the problem addresses only certain (deterministic) data models. In this paper, we therefore present the first study on processing top-k representative skyline queries in uncertain databases, based on user-defined references regarding the priority of individual dimensions. We also apply the odds ratio to restrict the cardinality of the result set, instead of a threshold that might be difficult for an end user to define. We then develop two novel algorithms for answering top-k representative skyline queries on uncertain data, and we propose several pruning conditions to improve their efficiency. Performance evaluations on both real-life and synthetic datasets demonstrate the efficiency, effectiveness and scalability of the proposed approaches.
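For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the standard building blocks only, not the paper's algorithms, which are not described on this page. It assumes tuple-level uncertainty: each object either exists with an independent probability or does not, and smaller attribute values are better. Under that assumption, an object's skyline probability is its own existence probability times the probability that none of its dominators exist. The `UncertainObject` representation and all function names are hypothetical, and `odds` is only the textbook odds p / (1 - p); how the paper uses the odds ratio to bound the result size is not shown here.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: (attribute values, existence probability).
UncertainObject = Tuple[Sequence[float], float]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(i: int, objects: List[UncertainObject]) -> float:
    """Probability that object i exists and no object dominating it exists,
    assuming independent existence probabilities."""
    values, prob = objects[i]
    result = prob
    for j, (other_values, other_prob) in enumerate(objects):
        if j != i and dominates(other_values, values):
            result *= 1.0 - other_prob
    return result


def odds(p: float) -> float:
    """Textbook odds of an event with probability p."""
    return float("inf") if p >= 1.0 else p / (1.0 - p)


def top_k_by_skyline_probability(
    objects: List[UncertainObject], k: int
) -> List[Tuple[int, float]]:
    """Indices and skyline probabilities of the k highest-scoring objects
    (ties broken by index). A brute-force O(n^2) baseline without pruning."""
    scored = [(skyline_probability(i, objects), i) for i in range(len(objects))]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(i, p) for p, i in scored[:k]]


if __name__ == "__main__":
    data: List[UncertainObject] = [
        ((1.0, 4.0), 0.5),
        ((2.0, 2.0), 0.8),
        ((3.0, 3.0), 0.9),  # dominated by (2.0, 2.0)
        ((4.0, 1.0), 1.0),
    ]
    for index, p in top_k_by_skyline_probability(data, k=2):
        print(index, p, odds(p))
```

The brute-force scan compares every pair of objects; pruning conditions such as those the abstract mentions would aim to avoid most of these comparisons.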

Author information

Corresponding author

Correspondence to Ha Thanh Huynh Nguyen.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Nguyen, H.T.H., Cao, J. (2015). Preference-Based Top-k Representative Skyline Queries on Uncertain Databases. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_22

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
