Unsupervised deep neuron-per-neuron hashing

Applied Intelligence

Abstract

Hashing has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval. A variety of hashing methods have been developed for learning an efficient binary data representation, mainly by relaxing some of the constraints imposed during hash function learning. Although these methods achieve a good accuracy-speed trade-off, the resulting binary codes may sometimes fail to approximate the input data adequately, which significantly decreases search accuracy. In this paper, we present a new unsupervised deep learning hashing approach, called Deep Neuron-per-Neuron Hashing, for high-dimensional data indexing. Unlike most existing hashing approaches, our method does not seek to binarize the neural network output, but instead relies directly on the continuous output to create an efficient index structure with hash tables. Given the deepest layer of the neural network, each table separately indexes the output of one neuron, thereby capturing a particular high-level individual structure (feature) of the input. An efficient search is then performed by computing the cumulative collision score of a given query over all the neuron-based hash tables. Experimental comparisons with the state of the art demonstrate the competitiveness of the proposed method on large datasets.
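
The indexing and search scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a continuous neuron output is mapped to a bucket, so the fixed-width quantization, the `width` parameter, and the function names `build_tables` and `search` are all assumptions made for illustration. The neural network itself is left out; the input vectors stand in for the outputs of its deepest layer.

```python
from collections import defaultdict
import math


def build_tables(codes, width=0.1):
    """Build one hash table per neuron (hypothetical helper).

    codes: list of continuous output vectors, one per indexed item.
    width: assumed bucket width used to quantize each neuron output.
    """
    n_neurons = len(codes[0])
    tables = [defaultdict(list) for _ in range(n_neurons)]
    for item_id, vec in enumerate(codes):
        for j, value in enumerate(vec):
            # Items whose j-th output falls in the same interval share a bucket.
            tables[j][math.floor(value / width)].append(item_id)
    return tables


def search(tables, query, width=0.1, top_k=3):
    """Rank items by how many neuron tables they collide with the query in."""
    scores = defaultdict(int)
    for j, value in enumerate(query):
        bucket = math.floor(value / width)
        for item_id in tables[j].get(bucket, []):
            scores[item_id] += 1  # cumulative collision score
    # Highest score first; ties broken by item id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy data: three items, each described by three neuron outputs.
codes = [
    [0.12, 0.81, 0.33],
    [0.15, 0.79, 0.91],
    [0.95, 0.05, 0.38],
]
tables = build_tables(codes)
print(search(tables, [0.14, 0.83, 0.35]))  # [(0, 3), (1, 1), (2, 1)]
```

In this toy run, item 0 shares a bucket with the query in all three tables and therefore ranks first, while items 1 and 2 each collide in only one table. A practical index would also need a policy for queries that fall near bucket boundaries, which this sketch ignores.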


Notes

  1. A bucket is a data type that groups objects together. The term is used in hashing algorithms, where different items that have the same hash code (hash value) go into the same bucket.

  2. http://corpus-texmex.irisa.fr


Author information

Corresponding author

Correspondence to Sanaa Chafik.

About this article

Cite this article

Chafik, S., El Yacoubi, M.A., Daoudi, I. et al. Unsupervised deep neuron-per-neuron hashing. Appl Intell 49, 2218–2232 (2019). https://doi.org/10.1007/s10489-018-1353-5

  • DOI: https://doi.org/10.1007/s10489-018-1353-5
