Probabilistic Threshold Range Aggregate Query Processing over Uncertain Data

  • Conference paper
Advances in Data and Web Management (APWeb 2009, WAIM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5446)

Abstract

Large amounts of uncertain data are inherent in many novel and important applications such as sensor data analysis and mobile data management. A probabilistic threshold range aggregate (PTRA) query retrieves summarized information about the uncertain objects satisfying a range query, with respect to a given probability threshold. This paper is the first to address this important type of query. We develop a new index structure, the aU-tree, and propose an exact querying algorithm based on it. In pursuit of efficiency, two techniques, SingleSample and DoubleSample, are developed. Both techniques provide approximate answers to a PTRA query with an accuracy guarantee. An experimental study demonstrates the efficiency and effectiveness of our proposed methods.
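
To make the query type concrete, here is a minimal brute-force sketch of a PTRA query. It is an illustration under stated assumptions, not the paper's method: each uncertain object is assumed to be a finite set of weighted locations, the aggregate is assumed to be COUNT, and the names `UncertainObject`, `appearance_probability`, and `ptra_count` are hypothetical. It scans every object and uses neither the aU-tree nor the SingleSample and DoubleSample techniques.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class UncertainObject:
    # Assumed model: a finite list of (location, probability) pairs
    # whose probabilities sum to 1.
    instances: List[Tuple[Point, float]]


def appearance_probability(obj: UncertainObject, rect: Rect) -> float:
    """Total probability mass of the object's instances inside the rectangle."""
    x_min, y_min, x_max, y_max = rect
    return sum(
        p
        for (x, y), p in obj.instances
        if x_min <= x <= x_max and y_min <= y <= y_max
    )


def ptra_count(objects: List[UncertainObject], rect: Rect, threshold: float) -> int:
    """Count objects whose appearance probability in rect meets the threshold."""
    return sum(
        1 for obj in objects if appearance_probability(obj, rect) >= threshold
    )


# Three toy objects and one query rectangle (all values are made up).
objects = [
    UncertainObject([((1.0, 1.0), 0.5), ((5.0, 5.0), 0.5)]),
    UncertainObject([((1.0, 2.0), 0.75), ((9.0, 9.0), 0.25)]),
    UncertainObject([((8.0, 8.0), 1.0)]),
]
query = (0.0, 0.0, 3.0, 3.0)
```

With these toy values, `ptra_count(objects, query, 0.6)` counts only the second object, whose probability mass inside the rectangle is 0.75. A linear scan like this touches every instance of every object, which is the cost an index or a sampling scheme would aim to avoid.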

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, S., Zhang, W., Zhang, Y., Lin, X. (2009). Probabilistic Threshold Range Aggregate Query Processing over Uncertain Data. In: Li, Q., Feng, L., Pei, J., Wang, S.X., Zhou, X., Zhu, QM. (eds) Advances in Data and Web Management. APWeb/WAIM 2009. Lecture Notes in Computer Science, vol 5446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00672-2_7

  • DOI: https://doi.org/10.1007/978-3-642-00672-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00671-5

  • Online ISBN: 978-3-642-00672-2

  • eBook Packages: Computer Science, Computer Science (R0)
