ABSTRACT
When integrating geo-spatial datasets, a join algorithm is used to find sets of corresponding objects (i.e., objects that represent the same real-world entity). Algorithms for joining two datasets have been studied in the past. This paper investigates the integration of three datasets and proposes methods that can be easily generalized to any number of datasets. Two approaches that use only the locations of objects are presented and compared. In the first approach, a join algorithm for two datasets is applied sequentially. In the second approach, all the integrated datasets are processed simultaneously. Join algorithms are given for both approaches, and their performance is compared in terms of recall and precision. The algorithms are designed to perform well even when locations are imprecise and each dataset represents only some of the real-world entities. Results of extensive experiments show that one of the algorithms has the best (or close to the best) performance under all circumstances. This algorithm performs much better than sequential application of the one-sided nearest-neighbor join.
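To make the baseline concrete, the following is a minimal sketch of applying a one-sided nearest-neighbor join sequentially to three datasets and scoring the result by precision and recall. It is an illustration under stated assumptions, not the paper's algorithms: the function names, the toy coordinates, and the simplification that only full triples are scored are all hypothetical.

```python
from math import hypot


def nearest(point, candidates):
    """Return the id of the candidate closest to `point` (Euclidean distance)."""
    return min(
        candidates,
        key=lambda cid: hypot(point[0] - candidates[cid][0],
                              point[1] - candidates[cid][1]),
    )


def one_sided_nn_join(a, b):
    """Pair every object of `a` with its nearest object in `b`."""
    return {aid: nearest(pt, b) for aid, pt in a.items()}


def sequential_join(a, b, c):
    """Join `a` with `b`, then extend each pair with the nearest object in `c`."""
    ab = one_sided_nn_join(a, b)
    bc = one_sided_nn_join(b, c)
    return {(aid, bid, bc[bid]) for aid, bid in ab.items()}


def precision_recall(result, truth):
    """Precision and recall of the returned sets against the true sets."""
    correct = len(result & truth)
    precision = correct / len(result) if result else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy data: a3 has no counterpart in B or C, mimicking a
# dataset that represents only some of the real-world entities.
A = {"a1": (0.0, 0.0), "a2": (10.0, 10.0), "a3": (20.0, 0.0)}
B = {"b1": (0.5, 0.0), "b2": (10.0, 10.5)}
C = {"c1": (0.0, 0.6), "c2": (10.4, 10.0), "c3": (30.0, 30.0)}

# Simplification: only full triples are treated as ground truth here.
TRUTH = {("a1", "b1", "c1"), ("a2", "b2", "c2")}

result = sequential_join(A, B, C)
precision, recall = precision_recall(result, TRUTH)
print(sorted(result))
print(precision, recall)
```

In this toy case the one-sided join forces `a3` onto `b2` even though it has no true counterpart, so the result contains a spurious triple and precision drops to 2/3 while recall stays at 1. This is the kind of error that partial coverage introduces for a join that always returns a nearest neighbor.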
Index Terms
- Finding corresponding objects when integrating several geo-spatial datasets