Progressive High-Dimensional Similarity Join

  • Conference paper
Database and Expert Systems Applications (DEXA 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4653)

Abstract

The Rate-Based Progressive Join (RPJ) is a non-blocking relational equijoin algorithm that delivers results progressively. In this paper, we first present a naive extension of RPJ, called neRPJ, to the progressive computation of similarity joins over high-dimensional data. We argue that this naive extension is not suitable. We therefore propose the Result-Rate Progressive Join (RRPJ) for high-dimensional distance similarity joins. Using both synthetic and real-life datasets, we empirically show that RRPJ is effective and efficient, and that it outperforms the naive extension.
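
To make the terms in the abstract concrete, the following is a minimal sketch of a non-blocking distance similarity join: each arriving point is probed against the points already seen from the other input, and every pair within distance epsilon is emitted immediately. It is not RRPJ or neRPJ as defined in the paper. It assumes Euclidean distance and unlimited memory, and the names `progressive_similarity_join`, `arrivals`, and `epsilon` are hypothetical, chosen only for illustration.

```python
import math
from typing import Dict, Iterable, Iterator, List, Tuple

Point = Tuple[float, ...]


def euclidean(p: Point, q: Point) -> float:
    """Euclidean distance between two points of equal dimensionality."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def progressive_similarity_join(
    arrivals: Iterable[Tuple[str, Point]], epsilon: float
) -> Iterator[Tuple[Point, Point]]:
    """Yield (r, s) pairs with distance <= epsilon as soon as both have arrived.

    `arrivals` is an interleaved stream of ("R", point) or ("S", point).
    Assumption: everything fits in memory, so no flushing policy is modeled.
    """
    seen: Dict[str, List[Point]] = {"R": [], "S": []}
    for source, point in arrivals:
        other = "S" if source == "R" else "R"
        # Probe the new point against everything already seen on the other side.
        for candidate in seen[other]:
            if euclidean(point, candidate) <= epsilon:
                # Always report pairs in (R point, S point) order.
                yield (point, candidate) if source == "R" else (candidate, point)
        # Store the point so that later arrivals can be matched against it.
        seen[source].append(point)


stream = [
    ("R", (0.0, 0.0)),
    ("S", (0.1, 0.0)),
    ("S", (5.0, 5.0)),
    ("R", (5.0, 5.2)),
]
results = list(progressive_similarity_join(stream, epsilon=0.5))
```

Because the function is a generator, a consumer receives the first matching pair after the second arrival rather than after both inputs are exhausted, which is the sense of "progressive" here. A real progressive join must also decide which data to move to disk when memory fills; that policy is outside this sketch.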

Editor information

Roland Wagner Norman Revell Günther Pernul

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tok, W.H., Bressan, S., Lee, ML. (2007). Progressive High-Dimensional Similarity Join. In: Wagner, R., Revell, N., Pernul, G. (eds) Database and Expert Systems Applications. DEXA 2007. Lecture Notes in Computer Science, vol 4653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74469-6_24

  • DOI: https://doi.org/10.1007/978-3-540-74469-6_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74467-2

  • Online ISBN: 978-3-540-74469-6

  • eBook Packages: Computer Science, Computer Science (R0)
