An Evaluation of Modern Spatial Libraries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12113)

Abstract

Applications such as Uber, Yelp, and Tinder rely on spatial data or locations from their users. These applications and services either build their own spatial data management systems or rely on existing solutions. The JTS Topology Suite (JTS), its C++ port GEOS, Google S2, ESRI Geometry API, and Java Spatial Index (JSI) are among the spatial processing libraries that these systems build upon. Applications and services depend on the indexing capabilities available in such libraries for high-performance spatial query processing. However, limited prior work has empirically compared these libraries. Herein, we compare these libraries qualitatively and quantitatively based on four popular spatial queries and using two real-world datasets. We also include a lesser-known library (jvptree), which uses Vantage Point Trees. In addition to query performance, we analyze construction time and space overhead, and identify the strengths and weaknesses of each library and its underlying index structures. Our results demonstrate that there are vast differences in space consumption (up to 9.8×), construction time (up to 5×), and query runtime (up to 54×) between the libraries evaluated.

Notes

  1. https://github.com/Esri/geometry-api-java.

  2. https://github.com/aled/jsi.

  3. http://trove4j.sourceforge.net/html/overview.html.

  4. https://www.opengeospatial.org/standards/sfa.

  5. https://trac.osgeo.org/geos/.

  6. https://github.com/google/s2geometry.

  7. https://www.fastcompany.com/3007394/how-foursquare-building-humane-map-framework-rival-googles/.

  8. https://www.infoq.com/presentations/uber-market-platform/.

  9. https://tech.gotinder.com/geosharded-recommendations-part-1-sharding-approach-2/.

  10. https://github.com/jchambers/jvptree.

  11. https://openjdk.java.net/projects/code-tools/jmh/.

  12. https://openjdk.java.net/projects/code-tools/jol/.

  13. https://github.com/google/benchmark.

  14. https://github.com/gperftools/gperftools.

  15. https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page.

  16. http://spatial-libs.db.in.tum.de.

  17. https://github.com/varpande/spatial-libs.

  18. CPU: https://ark.intel.com/content/www/us/en/ark/products/75272/intel-xeon-processor-e5-2660-v2-25m-cache-2-20-ghz.html.

  19. We kept the tolerance value at 0.0, which means that points are snapped to the same node only if their coordinates are exactly the same: https://locationtech.github.io/jts/javadoc/org/locationtech/jts/index/kdtree/KdTree.html.

  20. We store points from the datasets as degenerate rectangles in SAMs.

  21. http://trove4j.sourceforge.net/html/benchmarks.shtml.

  22. http://s2geometry.io/devguide/s2shapeindex.html.

  23. https://locationtech.github.io/jts/javadoc/org/locationtech/jts/geom/prep/PreparedGeometry.html.

  24. https://esri.github.io/geometry-api-java/javadoc/com/esri/core/geometry/Geometry.GeometryAccelerationDegree.html.
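
The idea in note 20 can be illustrated with a minimal sketch, assuming that SAMs here are rectangle-based spatial access methods such as R-trees. A point is stored as a rectangle whose lower and upper corners coincide, so the same intersection test serves both points and extended objects. The `Rect` type, the `range_query` helper, and the sample coordinates are hypothetical and are not taken from any of the evaluated libraries; the linear scan merely stands in for an index lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical type, for illustration only)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    @staticmethod
    def from_point(x: float, y: float) -> "Rect":
        # A point becomes a degenerate rectangle: both corners coincide.
        return Rect(x, y, x, y)

    def is_degenerate(self) -> bool:
        # True when the rectangle has zero width and zero height.
        return self.min_x == self.max_x and self.min_y == self.max_y

    def intersects(self, other: "Rect") -> bool:
        # Closed-interval overlap test on each axis.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


def range_query(entries, window):
    """Linear scan standing in for an index lookup (assumption for brevity)."""
    return [r for r in entries if window.intersects(r)]


# Made-up sample coordinates (longitude, latitude), not from the datasets.
points = [(-73.99, 40.73), (-73.95, 40.78), (-74.10, 40.60)]
entries = [Rect.from_point(x, y) for x, y in points]

# Query window given by its lower-left and upper-right corners.
window = Rect(-74.00, 40.70, -73.90, 40.80)
hits = range_query(entries, window)
```

With this representation, a range query over points reduces to the rectangle-rectangle intersection test that a rectangle-based index already provides, at the cost of storing each coordinate twice.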

Author information

Corresponding author

Correspondence to Varun Pandey.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Pandey, V., van Renen, A., Kipf, A., Kemper, A. (2020). An Evaluation of Modern Spatial Libraries. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_46

  • DOI: https://doi.org/10.1007/978-3-030-59416-9_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59415-2

  • Online ISBN: 978-3-030-59416-9

  • eBook Packages: Computer Science, Computer Science (R0)
