Abstract
Hashing has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval. A variety of hashing methods have been developed to learn an efficient binary data representation, mainly by relaxing some of the constraints imposed during hash function learning. Although these methods achieve a good accuracy-speed trade-off, the resulting binary codes sometimes fail to approximate the input data adequately, which significantly decreases search accuracy. In this paper, we present a new unsupervised deep learning hashing approach, called Deep Neuron-per-Neuron Hashing, for high-dimensional data indexing. Unlike most existing hashing approaches, our method does not seek to binarize the neural network output, but instead relies directly on the continuous output to create an efficient index structure with hash tables. Given the deepest layer of the neural network, each table separately indexes the output of one neuron, thereby capturing a particular high-level individual structure (feature) of the input. An efficient search is then performed by computing a cumulative collision score of a given query over all the neuron-based hash tables. Experimental comparisons with state-of-the-art methods demonstrate the competitiveness of the proposed method on large datasets.
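The indexing and search scheme outlined in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how a continuous neuron output is mapped to a bucket, so the sketch assumes uniform quantization with a hypothetical `width` parameter, and the toy `codes` stand in for the deepest-layer outputs of a trained network. The function names `build_tables`, `collision_scores`, and `search` are illustrative only.

```python
import math
from collections import defaultdict


def bucket_of(value, width):
    # Assumption: uniform-width quantization of a continuous neuron output.
    return math.floor(value / width)


def build_tables(codes, width):
    """Build one hash table per neuron: bucket id -> set of item ids."""
    n_neurons = len(codes[0])
    tables = [defaultdict(set) for _ in range(n_neurons)]
    for item_id, code in enumerate(codes):
        for j, value in enumerate(code):
            tables[j][bucket_of(value, width)].add(item_id)
    return tables


def collision_scores(tables, query_code, width):
    """For each item, count the tables in which it shares the query's bucket."""
    scores = defaultdict(int)
    for j, value in enumerate(query_code):
        for item_id in tables[j].get(bucket_of(value, width), ()):
            scores[item_id] += 1
    return dict(scores)


def search(tables, query_code, width, k):
    """Return the k item ids with the highest cumulative collision score."""
    scores = collision_scores(tables, query_code, width)
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


# Toy stand-ins for the deepest-layer outputs of three indexed items.
codes = [
    [0.10, 0.90, 0.30],
    [0.15, 0.85, 0.70],
    [0.80, 0.20, 0.35],
]
width = 0.25  # hypothetical bucket width
tables = build_tables(codes, width)
query = [0.12, 0.88, 0.32]

print(collision_scores(tables, query, width))  # {0: 3, 1: 2, 2: 1}
print(search(tables, query, width, k=2))       # [0, 1]
```

Item 0 collides with the query in all three neuron tables, so it ranks first. In practice the candidates with the highest scores would typically be re-ranked by an exact distance, a step this sketch leaves out.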






Notes
A bucket is a data type that groups objects together. The term is used in hashing algorithms, where different items that have the same hash code (hash value) go into the same bucket.
Cite this article
Chafik, S., El Yacoubi, M.A., Daoudi, I. et al. Unsupervised deep neuron-per-neuron hashing. Appl Intell 49, 2218–2232 (2019). https://doi.org/10.1007/s10489-018-1353-5
DOI: https://doi.org/10.1007/s10489-018-1353-5