DOI: 10.1145/3437120.3437345
research-article

Prepartitioning in MapReduce Processing of Group Nearest-Neighbor Query

Published: 04 March 2021

Abstract

Given two datasets of points (called Query and Training), the Group K Nearest-Neighbor (GKNN) query retrieves the K points of the Training dataset with the smallest sum of distances to every point of the Query dataset. This spatial query has been studied in recent years, and several performance-improving techniques and pruning heuristics have been proposed. In a previous work, we presented the first MapReduce algorithm, consisting of alternating local and parallel phases, which can effectively process the GKNN query when the Query dataset fits in memory while the Training dataset belongs to the Big Data category. In subsequent works, we presented several improvements on the first version of the algorithm. In this paper we present a further improvement, which consists of prepartitioning the Training dataset. As shown in the experimentation section, this technique significantly reduces the data transfer and the total running time of the algorithm. Furthermore, the prepartitioning of the Training dataset is performed only once and can be reused with multiple Query datasets, leading to faster response times.
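
To make the query definition above concrete, the following is a minimal in-memory sketch of a brute-force GKNN computation. It is not the MapReduce algorithm of the paper and does not perform any prepartitioning or pruning; it only illustrates what result the query is expected to return. The function names (`sum_of_distances`, `gknn_brute_force`), the use of Euclidean distance, and the sample points are assumptions made for illustration.

```python
import heapq
import math


def sum_of_distances(point, query):
    """Sum of Euclidean distances from one Training point to every Query point."""
    return sum(math.dist(point, q) for q in query)


def gknn_brute_force(training, query, k):
    """Return the k Training points with the smallest sum of distances to Query.

    Illustrative baseline only: it scans every Training point, so it is
    practical only when both datasets fit in memory.
    """
    return heapq.nsmallest(k, training, key=lambda p: sum_of_distances(p, query))


# Hypothetical sample data (2D points).
query = [(0.0, 0.0), (2.0, 0.0)]
training = [(1.0, 0.0), (5.0, 5.0), (1.0, 1.0), (-3.0, 0.0)]

result = gknn_brute_force(training, query, k=2)
print(result)
```

The linear scan costs one distance computation per (Training point, Query point) pair, which is the cost that pruning heuristics and partitioning techniques aim to reduce when the Training dataset is large.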

Cited By

  • (2021) Efficient Group K Nearest-Neighbor Spatial Query Processing in Apache Spark. ISPRS International Journal of Geo-Information 10:11 (763). DOI: 10.3390/ijgi10110763. Online publication date: 11-Nov-2021

Published In

PCI '20: Proceedings of the 24th Pan-Hellenic Conference on Informatics
November 2020
433 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Group nearest-neighbor query
  2. Hadoop
  3. Spatial query processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PCI 2020
PCI 2020: 24th Pan-Hellenic Conference on Informatics
November 20 - 22, 2020
Athens, Greece

Acceptance Rates

Overall Acceptance Rate 190 of 390 submissions, 49%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 15 Jan 2025
