Abstract

As online spatial datasets grow in both number and sophistication, it becomes increasingly difficult for users to decide whether a dataset is suitable for their tasks, especially when they have no prior knowledge of the dataset. In this paper, we propose browsing as an effective and efficient way to explore the content of a spatial dataset. Browsing lets users view the size of a result set before the query is evaluated at the database, thereby avoiding zero-hit and mega-hit queries and saving time and resources. Although the technique underlying browsing is similar to range query aggregation and selectivity estimation, spatial dataset browsing poses some unique challenges. We identify a set of spatial relations that browsing applications need to support, namely the contains, contained, and overlap relations. We prove a lower bound on the storage required to answer queries about the contains relation accurately at a given resolution. We then present three storage-efficient approximation algorithms, which we believe to be the first to estimate query results for these spatial relations. We evaluate these algorithms with both synthetic and real-world datasets and show that they provide highly accurate estimates for datasets with various characteristics.
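To make the three relations concrete, the following minimal sketch computes exact per-relation counts for a query rectangle by brute force. It is not one of the paper's approximation algorithms; it only illustrates the quantities such algorithms would estimate. The `Rect` type, the function names, and the reading of the relations (contains: the query encloses the object; contained: the object encloses the query; overlap: the two intersect but neither encloses the other) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with (x1, y1) as lower-left and (x2, y2) as upper-right."""
    x1: float
    y1: float
    x2: float
    y2: float


def encloses(a: Rect, b: Rect) -> bool:
    """True if rectangle a fully encloses rectangle b (shared borders allowed)."""
    return a.x1 <= b.x1 and a.y1 <= b.y1 and a.x2 >= b.x2 and a.y2 >= b.y2


def intersects(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b share at least one point."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def browse_counts(query: Rect, objects: list[Rect]) -> dict[str, int]:
    """Count objects by their relation to the query (exact, O(n) scan).

    Assumed semantics: an object identical to the query is counted
    under "contains" because that branch is checked first.
    """
    counts = {"contains": 0, "contained": 0, "overlap": 0, "disjoint": 0}
    for obj in objects:
        if encloses(query, obj):
            counts["contains"] += 1
        elif encloses(obj, query):
            counts["contained"] += 1
        elif intersects(query, obj):
            counts["overlap"] += 1
        else:
            counts["disjoint"] += 1
    return counts


# Hypothetical toy dataset, one object per relation for the first query.
objects = [
    Rect(1, 1, 2, 2),    # inside the query
    Rect(0, 0, 10, 10),  # encloses the query
    Rect(3, 3, 6, 6),    # partially overlaps the query
    Rect(8, 8, 9, 9),    # disjoint from the query
]
result = browse_counts(Rect(0.5, 0.5, 4, 4), objects)
zero_hit = browse_counts(Rect(20, 20, 21, 21), objects)
```

A browsing interface would show counts like these before the full query runs, so a user can spot a zero-hit query (as in `zero_hit`, where every object is disjoint) without retrieving any objects. The exact scan touches every object, which is why a compact summary that approximates the counts is attractive for large datasets.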

Author information

Corresponding author

Correspondence to Nagender Bandi.

Additional information

Recommended by: Sunil Prabhakar

Work supported by NSF grants IIS 02-23022 and CNF 04-23336. An earlier version of this paper appeared in the 17th International Conference on Data Engineering (ICDE 2001).

About this article

Cite this article

Sun, C., Bandi, N., Agrawal, D. et al. Exploring spatial datasets with histograms. Distrib Parallel Databases 20, 57–88 (2006). https://doi.org/10.1007/s10619-006-8576-x

  • DOI: https://doi.org/10.1007/s10619-006-8576-x
