Top-K representative documents query over geo-textual data stream

World Wide Web

Abstract

The increasing popularity of location-based social networks encourages more and more users to share their experiences. These shared experiences strongly influence customers' decisions when shopping, traveling, and so on. This paper studies the problem of top-K valuable document queries over geo-textual data streams. Many researchers have studied this problem. However, existing approaches do not consider the reliability of documents, so unreliable documents may mislead customers into making poor decisions. In addition, they lack the ability to prune documents with low representativeness. To increase user satisfaction in recommendation systems, we propose a novel framework named PDS. It first employs an efficient machine learning technique named ELM to prune unreliable documents, and then uses a novel index named \(\mathcal {GH}\) to maintain the documents. On the one hand, this index maintains a group of pruning values to filter out low-quality documents. On the other hand, it exploits the unique properties of the sliding window to further improve the performance of PDS. Theoretical analysis and extensive experimental results demonstrate the effectiveness of the proposed algorithms.
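The two-stage idea described in the abstract (first discard unreliable documents, then keep only the most valuable survivors) can be illustrated with a minimal sketch. This is not the PDS framework itself: the `Document` type, its `score` field, and the `toy_filter` predicate are hypothetical stand-ins, the reliability check here is a trivial rule rather than a trained ELM classifier, and the \(\mathcal {GH}\) index is not reproduced.

```python
import heapq
from typing import Callable, Iterable, List, NamedTuple


class Document(NamedTuple):
    doc_id: int
    text: str
    score: float  # hypothetical representativeness score, assumed precomputed


def top_k_reliable(
    window: Iterable[Document],
    is_reliable: Callable[[Document], bool],
    k: int,
) -> List[Document]:
    """Stage 1: drop unreliable documents. Stage 2: keep the k highest-scoring survivors."""
    reliable = (d for d in window if is_reliable(d))
    return heapq.nlargest(k, reliable, key=lambda d: d.score)


def toy_filter(doc: Document) -> bool:
    # Toy stand-in for a trained classifier (assumption): treat very short texts as unreliable.
    return len(doc.text.split()) >= 3


docs = [
    Document(1, "great food and friendly staff", 0.9),
    Document(2, "bad", 0.95),
    Document(3, "quiet place near the river", 0.7),
    Document(4, "ok coffee, slow service", 0.4),
]
result = top_k_reliable(docs, toy_filter, k=2)
print([d.doc_id for d in result])
```

Document 2 has the highest score but is removed in the first stage, so it never competes for a top-K slot. Filtering before ranking is the point of the ordering: an unreliable document cannot displace a reliable one.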

Notes

  1. In this paper, we use the tuple 〈N,s〉 to express a sliding window, where N is the window length and s is the number of objects that arrive at the window at the same moment.
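One way to read the 〈N,s〉 notation is as a count-based window that holds at most N objects and advances by s objects per step. The sketch below follows that reading, which is an assumption rather than the paper's definition; the class name `CountBasedWindow` and its `slide` method are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Iterable, List


class CountBasedWindow:
    """Sliding window <N, s>: holds the N most recent objects, s arrive per step."""

    def __init__(self, n: int, s: int) -> None:
        if n <= 0 or s <= 0 or s > n:
            raise ValueError("require 0 < s <= n")
        self.n = n
        self.s = s
        self.items: Deque[Any] = deque()

    def slide(self, batch: Iterable[Any]) -> List[Any]:
        """Insert one batch of exactly s objects and return the objects that expire."""
        batch = list(batch)
        if len(batch) != self.s:
            raise ValueError(f"expected {self.s} objects, got {len(batch)}")
        self.items.extend(batch)
        expired = []
        # Evict the oldest objects until the window is back to at most n objects.
        while len(self.items) > self.n:
            expired.append(self.items.popleft())
        return expired


w = CountBasedWindow(n=4, s=2)
w.slide([1, 2])
w.slide([3, 4])
expired = w.slide([5, 6])  # window is full, so the two oldest objects leave
print(expired, list(w.items))
```

Because objects expire in arrival order, every object's expiry time is known as soon as it arrives. That predictability is the kind of window property an index over a stream can exploit.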

Acknowledgments

This work is partially supported by the NSF of China for Outstanding Young Scholars under grant No. 61322208, the NSF of China under grant Nos. 61572122, 61272178, 61502317, U1401256, and the NSF of China for Key Program under grant No. 61532021. Bin Wang is the corresponding author.

Author information

Corresponding author

Correspondence to Rui Zhu.

About this article

Cite this article

Wang, B., Zhu, R., Yang, X. et al. Top-K representative documents query over geo-textual data stream. World Wide Web 21, 537–555 (2018). https://doi.org/10.1007/s11280-017-0470-0

  • DOI: https://doi.org/10.1007/s11280-017-0470-0
