Skip to main content
Log in

IRSJ: incremental refining spatial joins for interactive queries in GIS

  • Published:
GeoInformatica Aims and scope Submit manuscript

Abstract

An increasing number of emerging web database applications deal with large georeferenced data sets. However, exploring these large data sets through spatial queries can be very time and resource intensive. The need for interactive spatial queries has arisen in many applications such as Geographic Information Systems (GIS) for efficient decision-support. In this paper, we propose a new interactive spatial query processing technique for GIS. We present a family of the Incremental Refining Spatial Join (IRSJ) algorithms that can be used to report incrementally refined running estimates for aggregate queries while simultaneously displaying the actual query result tuples of the data sets sampled so far. Our goal is to minimize the time until an acceptably accurate estimate of the query result is available (to users) measured by a confidence interval. Our approach enables more interactive data exploration and analysis. While similar work has been done in relational databases, to the best of our knowledge, this is the first work using this approach in GIS. We investigate and evaluate different sampling methodologies through extensive experimental performance comparisons. Experiments on both real and synthetic data show an order of magnitude response time improvement relative to the final answer obtained when using a full R-tree join. We also show the impact of different index structures on the performance of our algorithms using three known sampling methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18

Similar content being viewed by others

References

  1. An N, Yang Z, Sivasubramaniam A (2001) Selectivity estimation for spatial joins. In: Proceedings of international conf. on data engineering (ICDE), pp 368–375

  2. Anselin L (1992) Spatial data analysis with GIS: an introduction to application in the social sciences. In: Technical report 92-10, National Center for Geographic Information and Analysis, University of California at Santa Barbara

  3. Aref WG, Samet H (1994) A cost model for query optimization using R-trees. In: Proceedings of workship advances in GIS

  4. Bae WD, Alkobaisi S, Leutenegger ST (2006) An incremental refinining spatial join algorithm for estimating qeury results in GIS. In: Proceedings of international conf. on database and expert systems applications (DEXA), pp 935–944

  5. Beckmann N, Kriegel H-P, Schneider R (1990) The R*-tree: an efficient and robust access methods for points and rectangles. In: Proceedings of ACM SIGMOD, pp 322–331

  6. Belussi A, Faloutsos C (1998) Self-spacial join selectivity estimation using fractal concepts. ACM Trans Inf Sys 16(2):161–201

    Article  Google Scholar 

  7. Brinkhoff T, Kriegel H, Seeger B (1993) Efficient processing of spatial joins using R-trees. In: Proceedings of ACM SIGMOD, pp 127–246

  8. Chen CM, Roussopoulos N (1994) Adaptive selectivity estimation using query feedback. In: Proceedings of ACM SIGMOD, pp 161–172

  9. Das A, Gehrke J, Riedewald M (2004) Approximation techniques for spatial data. In: Proceedings of ACM SIGMOD, pp 695–706

  10. Faloutsos C, Seeger B, Graina A, Traina C (2000) Spatial join selectivity using power laws. In: Proceedings of ACM SIGMOD, pp 177–188

  11. Faloutsos C, Sellis T, Roussopoulos N (1987) Analysis of object oriented spatial access methods. In: Proceedings of ACM SIGMOD, pp 426–439

  12. Ghilani CD, Wolf PR (2006) Adjustment computations: spatial data analysis. Wiley, New York

    Google Scholar 

  13. Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: Proceedings of ACM SIGMOD, pp 45–57

  14. Haas PJ, Swami AN (1995) Sampling-based selectivity estimation for joins using augmented frequent value statistics. In: Proceedings of international conf. on data engineering (ICDE), pp 522–531

  15. Haas PJ, Naughton JF, Swami AN (1994) On the relative cost of sampling for join selectivity estimation. In: Proceedings of ACM PODS, pp 14–24

  16. Harangsri JSB, Ngu A (1997) Selectivity estimation for joins using systematic sampling. In: Proceedings of international conf. on database and expert systems applications (DEXA), pp 384–389

  17. Hass PJ (1997) Large-sample and deterministic confidence intervals for online aggregation. In: Proceedings of international conf. scientific and statistical databases management (SSDBM), pp 51–63

  18. Hass PJ, Hellerstein JM (1999) Ripple joins for online aggregation. In: Proceedings of ACM SIGMOD, pp 287–298

  19. Hellerstein JM, Hass PJ, Wang HJ (1997) Online aggregation. In: Proceedings of ACM SIGMOD, pp 171–182

  20. Hellerstein JM, Avnur R, Raman V (2000) Informix under control: online query processing. Data Mining and Knowledge Discovery 12:281–314

    Article  Google Scholar 

  21. Huang YW, Jing N, Rundensteiner EA (1997a) A cost model for estimating the performance of spatial joins using R-trees. In: Proceedings of international conf. on scientific and statistical databases management (SSDBM), pp 30–38

  22. Huang YW, Jing N, Rundensteiner EA (1997b) Spatial join using R-tree: breadth-first traversal with global optimizations. In: Proceedings of VLDB, pp 396–405

  23. Kamel I, Faloutsos C (1993) An packing R-trees. In: Proceedings of ACM CIKM, pp 490–499

  24. Larson RR (1996) Geographic information retrieval and spatial browsing. GIS and Libraries, University of Illinois

    Google Scholar 

  25. De Floriani L, Puppo E, Magillo P (1999) Applications of computational geometry to geographic information systems. Handbook of computational geometry, chapter 7, pp 333–388

  26. Leutenegger ST, Lopez MA, Edginton J (1997) STR: a simple and efficient algorithm for R-tree packing. In: Proceedings of international conf. on data engineering (ICDE), pp 497–506

  27. Leutenegger ST, Lopez MA (1998) The effect of buffering on the performance of R-trees. IEEE Trans Knowl Data Eng 12(1):33–44

    Article  Google Scholar 

  28. Lo ML, Ravishankar CV (1993) The design and implementation of seeded trees: an efficient method for spatial joins. IEEE Trans Knowl Data Eng 10(1):136–151

    Google Scholar 

  29. Medeiros CB, Pires F (1994) Databases for GIS. ACM SIDMOD Record 23(1):107–115

    Article  Google Scholar 

  30. Olken F (1993) Random sampling from databases. Master’s thesis, University of California at Berkeley

  31. Olken F, Rotem D (1986) Simple random sampling from relational databases. In: Proceedings of VLDB, pp 160–169

  32. Pagel B-U, Six H-W, Widmayer P (1993) Towards an analysis of range query performance. In: Proceedings of ACM PODS

  33. Papadias D, Mamoulis N, Theodoridis Y (1999) Processing and optimization of multiway spatial joins using R-trees. In: Proceedings of ACM PODS, pp 44–55

  34. Rubinstein RY (1981) Simulation and the Monte Carlo method. Wiley, New York

    Book  Google Scholar 

  35. Scheaffer RL, Mendenhall W, Ott RL (1995) Elementary survey sampling. Duxbury Press

  36. Serfling RJ (2002) Basic statistics for business and economics. McGraw-Hill, New York

    Google Scholar 

  37. Seshadri S (1992) Probabilistic methods in query processing. Master’s thesis, University of Wisconsin

  38. Sun C, Agrawal O, Abbadi AE (2002) Selectivity estimation for spatial joins with geometric selections. In: Proceedings of international conf. on extending database technology (EDBT), pp 609–626

  39. Theodoridis Y (2000) Efficient cost models for spatial queries using R-trees. IEEE Trans Knowledge Data Eng 12(1):19–32

    Article  Google Scholar 

  40. Theodoridis Y, Sellis T (1996) A model for the prediction of R-tree performance. In: Proceedings of ACM PODS, pp 161–171

  41. Theodoridis Y, Stefanakis E, Sellis T (1998) Cost models for join queries in spatial databases. In: Proceedings of international conf. data engineering (ICDE), pp 476–483

  42. USGS. USGS mineral resources on-line spatial data. http://tin.er.usgs.gov/

  43. Vassilakopoulos M, Manolopoulos Y (1997) On sampling regional data. Data Knowl Eng 22:309–318

    Article  Google Scholar 

Download references

Acknowledgements

The following people provided helpful suggestions during early stages of this work: Dr. Sada Narayanappa, Brandon Haenlein, and Mohammed Albow. Help in statistics was graciously provided by Dr. Petr Vojtěchovský.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wan D. Bae.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bae, W.D., Alkobaisi, S. & Leutenegger, S.T. IRSJ: incremental refining spatial joins for interactive queries in GIS. Geoinformatica 14, 507–543 (2010). https://doi.org/10.1007/s10707-009-0089-0

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10707-009-0089-0

Keywords

Navigation