DOI:10.1145/2811222.2811225
research-article

Nearest Neighbour Join with Groups and Predicates

Published: 22 October 2015

Abstract

This paper proposes the nearest neighbour join, r x T [G, Θ] s, with similarity on T and integrated support for grouping attributes G and selection predicates Θ. The corresponding evaluation algorithm, roNNJ, is robust and does not suffer from redundant fetches and index false hits, which are major performance bottlenecks in nearest neighbour joins that do not support grouping attributes and selection predicates. Our solution avoids redundant fetches because it accesses the fact table only once and uses the groups of the outer relation to limit the fact table to its relevant portions. We experimentally evaluate our solution using a data warehouse that manages analyses of animal feeds, and the TPC-H benchmark.
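
To make the operator's intended result concrete, the following is a minimal Python sketch of a nearest neighbour join with one grouping attribute and one selection predicate. It is not the roNNJ algorithm from the paper, only an illustration of the semantics described in the abstract. The function name `nn_join`, the dictionary-based rows, the sample attributes (`feed`, `day`, `lab`), the handling of ties (all equally close rows are kept), and the handling of outer rows without a candidate (they are dropped) are all assumptions made for this example.

```python
from bisect import bisect_left
from collections import defaultdict


def nn_join(outer, inner, group, t, theta=lambda row: True):
    """For each outer row, pair it with the inner rows of the same group
    that satisfy theta and are closest on attribute t (ties are kept)."""
    wanted = {row[group] for row in outer}  # groups present in the outer relation

    # Single pass over the inner (fact) relation: keep only rows whose
    # group occurs in the outer relation and that satisfy the predicate.
    by_group = defaultdict(list)
    for row in inner:
        if row[group] in wanted and theta(row):
            by_group[row[group]].append(row)

    # Sort each group on t once and remember the sorted key lists.
    keys_by_group = {}
    for g, rows in by_group.items():
        rows.sort(key=lambda row: row[t])
        keys_by_group[g] = [row[t] for row in rows]

    result = []
    for o in outer:
        rows = by_group.get(o[group], [])
        if not rows:
            continue  # assumption: outer rows without a candidate are dropped
        keys = keys_by_group[o[group]]
        i = bisect_left(keys, o[t])
        # The nearest value sits at position i - 1 or i of the sorted keys.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(keys)]
        best = min(abs(keys[j] - o[t]) for j in candidates)
        result.extend((o, s) for s in rows if abs(s[t] - o[t]) == best)
    return result


# Hypothetical sample data: join on the nearest day, per feed, lab "A" only.
outer = [{"feed": "hay", "day": 10}, {"feed": "soy", "day": 3}]
inner = [
    {"feed": "hay", "day": 7, "lab": "A"},
    {"feed": "hay", "day": 12, "lab": "B"},
    {"feed": "hay", "day": 11, "lab": "A"},
    {"feed": "soy", "day": 1, "lab": "A"},
    {"feed": "oat", "day": 3, "lab": "A"},
]
pairs = nn_join(outer, inner, group="feed", t="day",
                theta=lambda s: s["lab"] == "A")
print([(o["day"], s["day"]) for o, s in pairs])
```

In this sketch the inner relation is scanned once, and rows whose group does not occur in the outer relation (here the `oat` row) are discarded during that scan. This loosely mirrors the abstract's point about limiting the fact table to its relevant portions.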

Cited By

  • (2017) Category- and selection-enabled nearest neighbor joins. Information Systems 68, 3-16. DOI: 10.1016/j.is.2017.01.006. Online publication date: Aug-2017
  • (2015) DOLAP 2015 Workshop Summary. Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, 1939-1940. DOI: 10.1145/2806416.2806876. Online publication date: 17-Oct-2015

Published In

DOLAP '15: Proceedings of the ACM Eighteenth International Workshop on Data Warehousing and OLAP
October 2015
108 pages
ISBN:9781450337854
DOI:10.1145/2811222
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. robust nearest neighbour join
  2. similarity join
  3. sort merge

Qualifiers

  • Research-article

Conference

CIKM'15
Acceptance Rates

DOLAP '15 Paper Acceptance Rate 8 of 31 submissions, 26%;
Overall Acceptance Rate 29 of 79 submissions, 37%
