Skip to main content

Similarity Sketching

  • Living reference work entry
  • First Online:
Book cover Encyclopedia of Big Data Technologies

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

References

  • Andoni A, Indyk P (2008) Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun ACM 51(1):117–122

    Google Scholar 

  • Broder AZ (1997) On the resemblance and containment of documents. In: Proceedings of compression and complexity of sequences. IEEE, pp 21–29

    Google Scholar 

  • Broder AZ, Glassman SC, Manasse MS, Zweig G (1997) Syntactic clustering of the web. Comput Netw ISDN Syst 29(8):1157–1166

    Google Scholar 

  • Charikar M (2002) Similarity estimation techniques from rounding algorithms. In: Proceedings of symposium on theory of computing (STOC), pp 380–388

    Google Scholar 

  • Chierichetti F, Kumar R (2015) Lsh-preserving functions and their applications. J ACM 62(5):33

    Google Scholar 

  • Dahlgaard S, Knudsen MBT, Thorup M (2017) Fast similarity sketching. In: Proceedings of symposium on foundations of computer science (FOCS), pp 663–671

    Google Scholar 

  • Gionis A, Indyk P, Motwani R (1999) Similarity search in high dimensions via hashing. In: Proceedings of conference on very large databases (VLDB), pp 518–529

    Google Scholar 

  • Jégou H, Douze M, Schmid C (2011) Product quantization for nearest neighbor search. IEEE Trans Pattern Anal Mach Intell 33(1):117–128

    Google Scholar 

  • Li P, König AC (2011) Theory and applications of b-bit minwise hashing. Commun ACM 54(8):101–109

    Google Scholar 

  • Li P, Owen AB, Zhang C (2012) One permutation hashing. In: Advances in neural information processing systems (NIPS), pp 3122–3130

    Google Scholar 

  • Mitzenmacher M, Pagh R, Pham N (2014) Efficient estimation for high similarities using odd sketches. In: Proceedings of international world wide web conference (WWW), pp 109–118

    Google Scholar 

  • Rahimi A, Recht B (2007) Random features for large-scale kernel machines. In: Advances in neural information processing systems (NIPS), pp 1177–1184

    Google Scholar 

  • Thorup M (2013) Bottom-k and priority sampling, set similarity and subset sums with minimal independence. In: Proceedings of symposium on theory of computing (STOC). ACM, pp 371–380

    Google Scholar 

  • Wang J, Zhang T, Song J, Sebe N, Shen HT (2017) A survey on learning to hash. IEEE Trans Pattern Anal Mach Intell 13(9) https://doi.org/10.1109/TPAMI.2017.2699960

Download references

Acknowledgements

This work received support from the European Research Council under the European Union’s 7th Framework Programme (FP7/2007-2013)/ ERC grant agreement no. 614331.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rasmus Pagh .

Editor information

Editors and Affiliations

Section Editor information

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer International Publishing AG

About this entry

Check for updates. Verify currency and authenticity via CrossMark

Cite this entry

Pagh, R. (2018). Similarity Sketching. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_58-1

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_58-1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference MathematicsReference Module Computer Science and Engineering

Publish with us

Policies and ethics