DOI: 10.1145/3347146.3359086
research-article

GloBiMaps - A Probabilistic Data Structure for In-Memory Processing of Global Raster Datasets

Published: 05 November 2019

Abstract

In the last decade, more and more spatial data has been acquired on a global scale due to satellite missions, social media, and coordinated governmental activities. This observational data has a huge storage footprint, which makes global analysis challenging. Therefore, many information products have been designed in which observations are turned into global maps showing features such as land cover or land use, often with only a few discrete values and sparse spatial coverage, for example only within cities.
Traditional coding of such data as a raster image becomes challenging due to the sizes of the datasets and spatially non-local access patterns, for example, when labeling social media streams.
This paper proposes GloBiMap, a randomized data structure, based on Bloom filters, for modeling low-cardinality sparse raster images of excessive sizes in a configurable amount of memory with pure random access operations avoiding costly intermediate decompression. In addition, the data structure is designed to correct the inevitable errors of the randomized layer in order to have a fully exact representation.
We show the feasibility of the approach on several real-world data sets, including the Global Urban Footprint, in which each pixel denotes whether a particular location contains a building at a resolution of roughly 10 m globally, as well as on a global Twitter sample of more than 220 million precisely geolocated tweets.
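The idea summarized in the abstract, a Bloom filter over pixel coordinates plus an exact correction layer for its false positives, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SparseBinaryRaster`, the BLAKE2b-based double hashing, and the brute-force scan used to collect false positives are all assumptions chosen for brevity, and the sketch covers only a binary (two-valued) raster.

```python
import hashlib


class SparseBinaryRaster:
    """Bloom-filter-backed set of 'on' pixels with an exact correction layer.

    Hypothetical illustration only; names and hashing scheme are assumptions.
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)
        self.false_positives = set()  # pixels the filter wrongly reports as set

    def _positions(self, x, y):
        # Double hashing: derive k bit positions from two 64-bit base hashes.
        digest = hashlib.blake2b(f"{x},{y}".encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # force an odd, nonzero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def put(self, x, y):
        # Random-access write: set k bits, no decompression step involved.
        for pos in self._positions(x, y):
            self.bits[pos >> 3] |= 1 << (pos & 7)
        self.false_positives.discard((x, y))

    def _probably_set(self, x, y):
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(x, y)
        )

    def build_corrections(self, width, height, truth):
        # One pass over the raster extent records every false positive.
        # Must be rerun after further put() calls, which can create new ones.
        self.false_positives = {
            (x, y)
            for y in range(height)
            for x in range(width)
            if (x, y) not in truth and self._probably_set(x, y)
        }

    def get(self, x, y):
        # Exact answer: a Bloom filter has no false negatives, and every
        # false positive inside the scanned extent is listed explicitly.
        return self._probably_set(x, y) and (x, y) not in self.false_positives


truth = {(3, 4), (10, 10), (63, 0)}
raster = SparseBinaryRaster(num_bits=256, num_hashes=3)
for x, y in truth:
    raster.put(x, y)
raster.build_corrections(64, 64, truth)
recovered = {(x, y) for y in range(64) for x in range(64) if raster.get(x, y)}
```

In this sketch the memory budget is set by `num_bits`: a smaller filter saves space but produces more false positives, which then have to be stored in the correction set. The full scan in `build_corrections` is only practical for a toy extent such as the 64 by 64 grid used here.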


Cited By

  • (2021) GloBiMapsAI: An AI-Enhanced Probabilistic Data Structure for Global Raster Datasets. ACM Transactions on Spatial Algorithms and Systems 7:4 (1-24). DOI: 10.1145/3453184. Online publication date: 21-Jun-2021
  • (2021) HQ-Filter: Hierarchy-Aware Filter For Empty-Resulting Queries in Interactive Exploration. 2021 22nd IEEE International Conference on Mobile Data Management (MDM) (49-58). DOI: 10.1109/MDM52706.2021.00019. Online publication date: Jun-2021

Published In

SIGSPATIAL '19: Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
November 2019
648 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data Sparsity and Compression
  2. Data Structures
  3. Geographic Information Systems
  4. Image Representation
  5. Randomized

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGSPATIAL '19

Acceptance Rates

SIGSPATIAL '19 Paper Acceptance Rate 34 of 161 submissions, 21%;
Overall Acceptance Rate 257 of 1,238 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 9
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 25 Feb 2025
