Abstract
Applications such as Uber, Yelp, and Tinder rely on spatial data or locations from their users. These applications and services either build their own spatial data management systems or rely on existing solutions. The JTS Topology Suite (JTS), its C++ port GEOS, Google S2, ESRI Geometry API, and Java Spatial Index (JSI) are among the spatial processing libraries that these systems build upon. Applications and services depend on the indexing capabilities available in such libraries for high-performance spatial query processing. However, limited prior work has empirically compared these libraries. Herein, we compare these libraries qualitatively and quantitatively based on four popular spatial queries and using two real-world datasets. We also compare a lesser-known library (jvptree), which utilizes Vantage Point Trees. In addition to the performance evaluation, we analyze construction time and space overhead, and identify the strengths and weaknesses of each library and its underlying index structures. Our results demonstrate that there are vast differences in space consumption (up to 9.8×), construction time (up to 5×), and query runtime (up to 54×) between the libraries evaluated.
Notes
- 19.
We set the tolerance value to 0.0, which means that points are snapped to the same node only if their coordinates are exactly the same: https://locationtech.github.io/jts/javadoc/org/locationtech/jts/index/kdtree/KdTree.html.
- 20.
We store points from the datasets as degenerate rectangles in SAMs.
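The note on degenerate rectangles can be illustrated with a minimal sketch. It assumes a point (x, y) is stored as a rectangle whose minimum and maximum corners coincide, so that a rectangle-based access method can hold point data without change. The names `Rect`, `point_as_rect`, and `range_query` are hypothetical, and the linear scan merely stands in for an index; none of this is the API of any library evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical type, for illustration only)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles intersect if they overlap on both axes
        # (touching boundaries count as intersecting).
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


def point_as_rect(x: float, y: float) -> Rect:
    # A degenerate rectangle: both corners are the same point.
    return Rect(x, y, x, y)


def range_query(entries: list[Rect], window: Rect) -> list[Rect]:
    # Linear scan as a stand-in for an index lookup (assumption).
    return [r for r in entries if r.intersects(window)]


points = [(1.0, 1.0), (2.5, 3.0), (5.0, 5.0)]
entries = [point_as_rect(x, y) for x, y in points]
window = Rect(0.0, 0.0, 3.0, 3.0)
hits = range_query(entries, window)
```

For a degenerate rectangle, the intersection test reduces to a point-in-rectangle test, which is why the same query code can serve both points and extended geometries.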
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Pandey, V., van Renen, A., Kipf, A., Kemper, A. (2020). An Evaluation of Modern Spatial Libraries. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12113. Springer, Cham. https://doi.org/10.1007/978-3-030-59416-9_46
DOI: https://doi.org/10.1007/978-3-030-59416-9_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59415-2
Online ISBN: 978-3-030-59416-9
eBook Packages: Computer Science, Computer Science (R0)