ABSTRACT
Similarity search, and specifically the nearest-neighbor (NN) search problem, is widely used in many fields of computer science, such as machine learning, computer vision, and databases. However, in many settings such searches suffer from the notorious curse of dimensionality, in which running time grows exponentially with the dimension d. This causes severe performance degradation in high-dimensional spaces. Approximate techniques such as locality-sensitive hashing [2] improve search performance but remain computationally intensive.
In this paper we propose a new way to solve this problem using a special hardware device called ternary content addressable memory (TCAM). TCAM is an associative memory, a special type of computer memory that is widely used in switches and routers for very high-speed search applications. We show that the TCAM computational model can be leveraged and adjusted to solve NN search problems in a single TCAM lookup cycle and with linear space. This approach does not suffer from the curse of dimensionality and is shown to improve on the best known approaches for NN by more than four orders of magnitude. Simulation results demonstrate this improvement and suggest that TCAM devices may play a critical role in future large-scale databases and cloud applications.
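The ternary matching that a TCAM performs can be pictured with a small software model. The sketch below is only an illustration of the general TCAM lookup semantics (entries made of 0, 1, and don't-care bits, with the first matching entry winning); it is not the encoding proposed in the paper. The names `parse_entry`, `tcam_lookup`, and `table` are hypothetical, and the narrow-to-wide ordering of the example entries is an assumption chosen to show how priority can favor the tightest match.

```python
from typing import List, Optional, Tuple

# A ternary entry is a (value, mask) pair: a mask bit of 1 means "compare this
# bit", a mask bit of 0 means "don't care" (the '*' symbol of a TCAM).
Entry = Tuple[int, int]


def parse_entry(pattern: str) -> Entry:
    """Convert a pattern such as '10*1' into a (value, mask) pair."""
    value = 0
    mask = 0
    for ch in pattern:
        value <<= 1
        mask <<= 1
        if ch == "1":
            value |= 1
            mask |= 1
        elif ch == "0":
            mask |= 1
        elif ch != "*":
            raise ValueError(f"invalid ternary symbol: {ch!r}")
    return value, mask


def tcam_lookup(entries: List[Entry], key: int) -> Optional[int]:
    """Return the index of the first (highest-priority) matching entry.

    Real TCAM hardware compares the key against all entries in parallel;
    this loop only imitates the result, not the single-cycle timing.
    """
    for index, (value, mask) in enumerate(entries):
        if (key & mask) == value:
            return index
    return None


# Hypothetical table (an assumption for illustration): entries are ordered
# from the narrowest pattern to the widest, so the tightest match wins.
table = [parse_entry(p) for p in ("0110", "01**", "0***", "****")]
```

Looking up the key `0b0101` in this table skips the exact entry `0110` and returns index 1, the narrowest wildcard pattern that covers the key. In hardware all entries are compared at once, which is what makes a single-lookup search plausible.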
REFERENCES
- M. Aly, M. Munich, and P. Perona. Indexing in large scale image collections: Scaling properties and benchmark. In WACV, pages 418--425, 2011.
- A. Andoni and P. Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Comm. of the ACM, 51(1):117--122, 2008.
- N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An efficient and robust access method for points and rectangles. In SIGMOD, pages 322--331, 1990.
- J. S. Beis and D. G. Lowe. Shape indexing using approximate nearest-neighbour search in high-dimensional spaces. In CVPR, pages 1000--1006, 1997.
- J. L. Bentley. Multidimensional divide-and-conquer. Comm. of the ACM, 23(4):214--229, 1980.
- A. Borodin, R. Ostrovsky, and Y. Rabani. Lower bounds for high dimensional nearest neighbor search and related problems. In STOC, pages 312--321, 1999.
- M. Brown and D. G. Lowe. Recognising panoramas. In ICCV, volume 3, page 1218, 2003.
- I. Corp. Intel Xeon processor E7-4870, 2011. http://ark.intel.com/products/53579/.
- V. Garcia, E. Debreuve, F. Nielsen, and M. Barlaud. K-nearest neighbor search: Fast GPU-based implementations and application to high-dimensional feature matching. In ICIP, pages 3757--3760, 2010.
- F. Gray. Pulse code communication. US Patent 2,632,058, March 17 1953 (filed November 13 1947).
- C. Inc. NEURON search processors, 2014. http://bit.ly/1uSaH8q.
- P. Indyk and R. Motwani. Approximate nearest neighbors: Towards removing the curse of dimensionality. In STOC, pages 604--613, 1998.
- W. Jiang, Q. Wang, and V. K. Prasanna. Beyond TCAMs: An SRAM-based parallel multi-pipeline architecture for terabit IP lookup. In INFOCOM, pages 1786--1794, 2008.
- K. Lakshminarayanan, A. Rangarajan, and S. Venkatachary. Algorithms for advanced packet classification with ternary CAMs. In SIGCOMM, pages 193--204, 2005.
- L. Liang, C. Liu, Y.-Q. Xu, B. Guo, and H.-Y. Shum. Real-time texture synthesis by patch-based sampling. ACM ToG, 20(3):127--150, 2001.
- S. Lloyd. Least squares quantization in PCM. IEEE Trans. Inf. Theor., 28(2):129--137, Sep 2006.
- D. G. Lowe. Object recognition from local scale-invariant features. In ICCV, pages 1150--1157, 1999.
- Nvidia. Tesla K80 GPU accelerator, Nov. 2014. http://bit.ly/1C69bVd.
- A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42:145--175, 2001.
- Open Networking Foundation. OpenFlow Switch Specification Version 1.3.2, April 2013.
- J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, pages 1--8, 2007.
- V. Ravikumar and R. N. Mahapatra. TCAM architecture for IP lookup using prefix properties. Micro, IEEE, 24(2):60--69, 2004.
- Renesas Electronics America Inc. 20Mbit QUAD-search content addressable memory. http://bit.ly/18hYySx.
- B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman. LabelMe: A database and web-based tool for image annotation. IJCV, 77(1--3):157--173, 2008.
- H. Samet. Foundations of multidimensional and metric data structures. Morgan Kaufmann, 2006.
- R. Shinde, A. Goel, P. Gupta, and D. Dutta. Similarity search and locality sensitive hashing using ternary content addressable memories. In SIGMOD, pages 375--386, 2010.
- R. Weber, H.-J. Schek, and S. Blott. A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces. In VLDB, pages 194--205, 1998.
Index Terms
- Ultra-Fast Similarity Search Using Ternary Content Addressable Memory