
Neurocomputing

Volume 282, 22 March 2018, Pages 32-41

Learning binary codes with local and inner data structure

https://doi.org/10.1016/j.neucom.2017.12.005

Abstract

Recent years have witnessed the promising capacity of hashing techniques in tackling nearest neighbor search, owing to their high efficiency in storage and retrieval. Data-independent approaches (e.g., Locality Sensitive Hashing) normally construct hash functions using random projections, which neglect intrinsic data properties. To compensate for this drawback, learning-based approaches explore local data structure and/or supervised information to boost hashing performance. However, due to the construction of the Laplacian matrix, existing methods usually suffer from unaffordable training costs. In this paper, we propose a novel supervised hashing scheme, which has the merits of (1) exploring the inherent neighborhoods of samples; (2) significantly reducing the training cost on massive training data by employing an approximate anchor graph; and (3) preserving semantic similarity by leveraging pair-wise supervised knowledge. Besides, we integrate the discrete constraint to eliminate the accumulated errors in learning reliable hash codes and hash functions. We devise an alternating algorithm to efficiently solve the optimization problem. Extensive experiments on various image datasets demonstrate that our proposed method is superior to the state-of-the-art.

Introduction

During the last decades, visual search over large-scale datasets has drawn much attention in many content-based multimedia applications [1], [2], [3]. Among large-scale vision problems, binary coding has attracted growing attention in image retrieval [4], [5], [6], image classification [7], [8], etc. Especially in recent years, the demand for retrieving relevant content from massive image collections has grown stronger than ever in the big-data era. On most occasions, users input a keyword and expect to obtain semantically similar images precisely and efficiently. For servers, it is almost impossible to search the objects linearly, especially when they confront vast amounts of images (e.g., the photo sharing website Flickr possesses more than five billion images, and photos are still uploaded at a rate of over 3000 images per minute) [9]. Compared with the cost of storage, the search task consumes even more computational resources due to the massive volume of search requests. Therefore, in recent years much effort has been devoted to this problem. Among the proposed methods, hashing shows great superiority over the others. In simple terms, hashing methods map high-dimensional images into short binary codes while preserving the similarity of the original data. In this way, searching for similar images reduces to finding neighboring hash codes in Hamming space, which is simple and practical. Consequently, the technique brings significant efficiency gains in multimedia, computer vision, information retrieval, machine learning and pattern matching [10], [11], [12], [13], [14], [15], [16], [17], [18], [19].

For many applications, approximate nearest neighbor (ANN) search is sufficient [20], [21], [22], [23], [24]. Given a query instance, the algorithm aims to find similar instances instead of returning the exact nearest neighbor. Therefore, efficient data structures are required to store the data for fast search. Against this background, tree-based indexing approaches were proposed for ANN search, typically with sub-linear complexity of O(log N). Among them, the KD-tree [25], [26], [27], the R-tree [28] and the metric tree [29] are the most representative. However, as imaging technology develops, the descriptors of an image often reach hundreds of dimensions, and as the dimensionality increases, the performance of tree-based methods degrades dramatically while their storage requirements grow considerably. In view of the inefficiency of tree-based indexing methods, hashing approaches have been proposed to map an entire dataset into discrete codes while preserving the similarity of the data. The similarity between data points can then be measured by the Hamming distance, which computers can calculate in very little time.
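To make the last point concrete, the following minimal sketch (our own illustration, not code from the paper; the 64-bit code length and dataset size are arbitrary choices) shows a brute-force scan over binary codes: the Hamming distance is just an XOR followed by a bit count, which is why even a linear scan in Hamming space is fast.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bits = 64
database = rng.integers(0, 2, size=(1_000_000, n_bits), dtype=np.uint8)
query = rng.integers(0, 2, size=n_bits, dtype=np.uint8)

# XOR marks exactly the bits where two codes disagree, so the Hamming
# distance is the number of nonzero entries after XOR. Production systems
# pack bits into machine words and use a hardware popcount instead.
distances = np.count_nonzero(database ^ query, axis=1)
top10 = np.argsort(distances)[:10]   # indices of the ten nearest codes
```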

Existing hashing methods can be roughly divided into two groups [30]: data-independent and data-dependent. One of the most classic data-independent methods is Locality-Sensitive Hashing (LSH) [31], which has been widely used to handle massive data due to its simplicity. LSH uses hash functions that randomly project or permute nearby data points into the same binary codes. However, LSH needs long binary codes to achieve promising retrieval performance, which increases storage space and computation costs. Moreover, LSH ignores the underlying distribution and manifold structure of the data on account of its random projections.
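As a reference point, here is a minimal random-hyperplane LSH sketch (our own rendering of the idea, not a specific library); note that the projections are drawn without ever looking at the data, which is exactly the data-independence criticized above.

```python
import numpy as np

def lsh_codes(X, n_bits, seed=0):
    """Random-hyperplane LSH: one bit per random projection.

    Each bit is the sign of a projection onto a random direction, so
    points with a small angle between them fall on the same side of
    most hyperplanes and receive similar codes.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_bits))   # random hyperplanes
    return (X @ W > 0).astype(np.uint8)             # codes in {0, 1}

X = np.random.default_rng(1).standard_normal((1000, 512))
B = lsh_codes(X, n_bits=64)                         # (1000, 64) binary matrix
```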

Realizing this deficiency, Weiss et al. proposed Spectral Hashing (SH) [32], which relaxes the original problem and utilizes a subset of thresholded eigenvectors of the graph Laplacian, improving retrieval accuracy to some extent. Yet it takes considerable time to build the neighborhood graph. Liu et al. made several improvements to SH and proposed Anchor Graph Hashing (AGH) [33], which uses anchor graphs to obtain low-rank adjacency matrices. The AGH formulation hashes novel inputs in constant time by extrapolating graph Laplacian eigenvectors to eigenfunctions. Note that SH and AGH are data-dependent, since they exploit the feature information of the data and preserve its metric structure. Methods of this kind are called unsupervised methods; further examples include principal component analysis based hashing (PCAH) [34], iterative quantization (ITQ) [35], isotropic hashing (Iso-Hash) [36] and affinity-preserving k-means hashing (KMH) [37].
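The anchor-graph idea can be sketched in a few lines. This is a simplified illustration under our own assumptions (AGH itself obtains anchors from k-means and derives the bandwidth and normalization more carefully):

```python
import numpy as np

def anchor_affinity(X, anchors, s=3, sigma=1.0):
    """Truncated point-to-anchor affinity Z of shape (n, m).

    Keeping only the s nearest anchors per point makes Z sparse, and the
    implicit n x n adjacency A = Z diag(Z^T 1)^{-1} Z^T is low rank and
    never materialized -- this is what avoids the O(n^2) Laplacian cost.
    """
    # squared Euclidean distances between all points and all anchors
    d2 = ((X ** 2).sum(1)[:, None] - 2 * X @ anchors.T
          + (anchors ** 2).sum(1)[None, :])
    nearest = np.argsort(d2, axis=1)[:, :s]          # s closest anchors
    rows = np.arange(X.shape[0])[:, None]
    Z = np.zeros_like(d2)
    Z[rows, nearest] = np.exp(-d2[rows, nearest] / (2 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)          # rows sum to one
```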

However, the hashing methods mentioned above cannot achieve high retrieval performance with a simple approximate affinity matrix [38]. Due to the semantic gap, where visual similarity often differs from semantic similarity, returning the nearest neighbors in metric space cannot guarantee search quality [34], [39]. To solve this problem, supervised hashing methods [5], [40], [41], [42], [43], [44], [45], [46], [47] exploit images that are manually labeled as similar or dissimilar. The following are the most popular among them. Kernel-Based Supervised Hashing (KSH) [48] maps data into a Hamming space where similar items have minimal Hamming distance while dissimilar items are maximally separated. Binary Reconstructive Embedding (BRE) [49] learns hash functions that minimize the reconstruction error between the metric space and the Hamming space. Canonical Correlation Analysis with Iterative Quantization (CCA-ITQ) [35] and Supervised Discrete Hashing (SDH) [50] were proposed to preserve semantic similarity. By leveraging pairwise label information, the performance of supervised methods has been remarkably improved. Moreover, some hashing approaches based on deep neural networks have recently been proposed to learn image representations and hash codes simultaneously.
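The pairwise-supervision principle behind methods such as KSH can be summarized by a code-inner-product objective; the sketch below is our own simplified rendering of that idea, not the exact KSH formulation.

```python
import numpy as np

def pairwise_code_loss(B, S):
    """KSH-style objective on a labeled subset.

    B: (n, r) codes in {-1, +1}; S: (n, n) pairwise labels, +1 for
    semantically similar pairs and -1 for dissimilar ones. Since the
    Hamming distance of r-bit codes is (r - b_i . b_j) / 2, fitting the
    scaled inner products B B^T / r to S simultaneously pulls similar
    pairs together and pushes dissimilar pairs apart in Hamming space.
    """
    r = B.shape[1]
    return np.linalg.norm(B @ B.T / r - S, ord="fro") ** 2

# toy check: identical codes give inner product r, i.e. similarity +1
B = np.sign(np.random.default_rng(0).standard_normal((5, 16)))
print(pairwise_code_loss(B, np.ones((5, 5))))
```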

Whether a method is supervised or unsupervised, an objective function with discrete constraints involves a mixed binary-integer problem [50], which is NP-hard. To tackle this, most hashing methods relax the discrete constraints: they first compute a continuous solution and then threshold it to obtain binary codes, overlooking the importance of discrete optimization. This technique leads to significant information loss during the learning process [51]. It has been shown that, if the discrete constraints are ignored, the quality of the codes degrades quickly as the code length increases. Some methods try to improve the quality by replacing the sign function with a smoother sigmoid function [52]; however, this does not solve the problem explicitly. To date, only a few methods directly generate hash codes in discrete space. In addition, deep neural networks have been applied in recent retrieval research [53], [54], [55]. Deep-learning-based hashing obtains high accuracy by learning the image representation and the hash codes in a tightly coupled way [56], [57]. Nonetheless, because the resulting computational expense and storage cost are huge, we do not compare against such methods extensively.
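The two relaxations mentioned above are easy to see in code (our illustration, with an arbitrary random projection standing in for a learned one):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 512))
W = rng.standard_normal((512, 64))   # stands in for a learned projection

H = X @ W                      # continuous relaxed "codes"
B = np.sign(H)                 # relax-then-round: the gap between H and B
                               # is quantization error, which accumulates
                               # bit by bit as the code length grows
B_smooth = np.tanh(4.0 * H)    # sigmoid-style smooth surrogate of sign();
                               # differentiable, but still not binary, so
                               # the discrete problem is only approximated

quant_error = np.linalg.norm(H - B)   # the loss that discrete methods avoid
```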

In this paper, we aim to design a supervised hashing method which can efficiently generate high-quality compact codes. We utilize an anchor graph, built from pairwise similarities, to exploit the inner structure of the original data. In the process of learning the hash functions, we also use the supervision information to preserve pairwise semantic similarity and improve retrieval accuracy. To avoid the accumulated errors caused by continuous relaxation, we choose to optimize the binary codes directly. With the discrete constraints added to the objective function, we propose a novel hashing framework, termed Local and Inner Data Structure Supervised Hashing (LISH), which is able to efficiently generate codes while satisfying semantic similarity. We enhance our earlier work [58] by conducting more experiments to make the results more detailed and accurate, and we also simplify some of the mathematical derivations. Our main contributions are summarized as follows:

  • Our method uses the graph Laplacian to capture local neighborhoods and enhance the quality of the hash codes, while the semantic gap is properly addressed by utilizing labeled information. In this way, both metric and semantic similarity are preserved, which contributes significantly to the performance.

  • Most existing hashing methods solve the problem by relaxing the discrete constraints. In contrast, we optimize the discrete problem directly, and each bit can be learned sequentially by the algorithm, so our method proceeds in an alternating and efficient manner.

  • We evaluate our method on three popular large-scale image datasets and obtain superior accuracy compared with the state-of-the-art.

The remainder of this paper is organized as follows. A brief review of related work is given in Section 2. We present the detailed formulation of the proposed LISH method in Section 3. Experimental results are given in Section 4. We conclude our work in Section 5.

Section snippets

Related work

Suppose we have $n$ samples $x_i \in \mathbb{R}^d$, $i = 1, \ldots, n$, stored in the matrix $X = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^{n \times d}$, where $d$ is the dimensionality of the feature space. In this section, we briefly review a few representative methods, including Spectral Hashing (SH) and Kernel-Based Supervised Hashing (KSH).

Local and inner data structure supervised hashing

In this section, we introduce the algorithm of our method in detail. We propose an alternating optimization model and learn each bit sequentially. The hash functions are learned simultaneously during the optimization process.

We defined $X$ above, and we denote the label matrix by $Y = [y_1, y_2, \ldots, y_n]^T \in \{0, 1\}^{n \times c}$, where $c$ is the number of label classes. When $x_i$ belongs to class $j$, $y_{ij} = 1$; otherwise $y_{ij} = 0$. By finding the mapping relation between the original feature space and the Hamming …
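For concreteness, the label matrix defined above can be built with a small one-hot helper (our own illustration, not code from the paper):

```python
import numpy as np

def one_hot_labels(labels, c):
    """Build the label matrix Y in {0,1}^{n x c} defined above:
    y_ij = 1 iff sample x_i belongs to class j, else 0."""
    Y = np.zeros((len(labels), c), dtype=np.uint8)
    Y[np.arange(len(labels)), labels] = 1
    return Y

# e.g. four samples over c = 3 classes
Y = one_hot_labels(np.array([0, 2, 1, 2]), c=3)
```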

Experiments

We conduct extensive experiments to evaluate our proposed method on three publicly available large-scale image datasets: CIFAR-10, SUN397 and YouTube Faces. The CIFAR-10 dataset is a labeled subset of the 80-million tiny images collection [64], which contains 60K 32 × 32 color images of ten categories, each with 6000 samples. The entire dataset is partitioned into two parts: a training subset of 59,000 images and a test subset of 1000 images. Each image is represented by a 512-dimensional …

Conclusion

In this paper, we exploited the underlying manifold structure of the samples via the graph Laplacian. An approximate anchor graph was used to reduce the training cost. To capture and preserve the semantic label information in the Hamming space, we explicitly formulated a tractable optimization function integrated with an ℓ2 loss and decomposed it into several sub-problems which could be iteratively solved by our algorithm. We proposed a discrete supervised paradigm to directly generate hash codes without …


References (66)

  • Gong, Y., et al., Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • Liu, X., et al., Compact kernel hashing with multiple features, Proceedings of the ACM International Conference on Multimedia (2012)
  • Hu, M., et al., Hashing with angular reconstructive embeddings, IEEE Trans. Image Process. (2018)
  • Mu, Y., et al., Hash-SVM: scalable kernel machines for large-scale visual classification, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR (2014)
  • Zhu, X., et al., Block-row sparse multiview multilabel learning for image classification, IEEE Trans. Cybern. (2016)
  • Wang, J., et al., Learning to hash for indexing big data – a survey, Proc. IEEE (2016)
  • Strecha, C., et al., LDAHash: improved matching with smaller descriptors, IEEE Trans. Pattern Anal. Mach. Intell. (2012)
  • Li, X., et al., Learning hash functions using column generation, Proceedings of the International Conference on Machine Learning (2013)
  • He, X., et al., Neural collaborative filtering, Proceedings of the Twenty-Sixth International Conference on World Wide Web, WWW (2017)
  • Song, J., et al., Inter-media hashing for large-scale retrieval from heterogeneous data sources, Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, SIGMOD (2013)
  • Zhu, X., et al., Robust joint graph sparse coding for unsupervised spectral feature selection, IEEE Trans. Neural Netw. Learn. Syst. (2017)
  • Liu, X., et al., Mixed image-keyword query adaptive hashing over multilabel images, ACM Trans. Multimed. Comput. Commun. Appl. (2014)
  • Zhu, X., et al., A sparse embedding and least variance encoding approach to hashing, IEEE Trans. Image Process. (2014)
  • Gionis, A., et al., Similarity search in high dimensions via hashing, Proceedings of the Twenty-Fifth International Conference on Very Large Data Bases, VLDB (1999)
  • Yang, Y., et al., Multitask spectral clustering by exploring intertask correlation, IEEE Trans. Cybern. (2015)
  • He, X., et al., Fast matrix factorization for online recommendation with implicit feedback, Proceedings of the Thirty-Ninth International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR (2016)
  • Yang, Y., et al., Discrete nonnegative spectral clustering, IEEE Trans. Knowl. Data Eng. (2017)
  • Smeulders, A.W.M., et al., Content-based image retrieval at the end of the early years, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
  • Bentley, J.L., Multidimensional binary search trees used for associative searching, Commun. ACM (1975)
  • Friedman, J.H., et al., An algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Softw. (1977)
  • Silpa-Anan, C., et al., Optimised KD-trees for fast image descriptor matching, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR (2008)
  • Gaede, V., et al., Multidimensional access methods, ACM Comput. Surv. (1998)
  • Wang, S., et al., A multi-label least-squares hashing for scalable image search, Proceedings of the SIAM International Conference on Data Mining (2015)

Shiyuan He is currently an undergraduate student in Yingcai Honors College, University of Electronic Science and Technology of China. His major research interests include image retrieval, hashing and machine learning.

Guo Ye is an undergraduate student in Yingcai Honors College, University of Electronic Science and Technology of China. His interests include image retrieval, hashing and machine learning.

Mengqiu Hu received the bachelor's degree in 2015 from the University of Electronic Science and Technology of China, where he is currently a master's student. His major research interests include computer vision and machine learning.

Yang Yang received the bachelor's degree from Jilin University in 2006, the master's degree from Peking University in 2009, and the Ph.D. degree from The University of Queensland, Australia, in 2012, under the supervision of Prof. H. T. Shen and Prof. X. Zhou. He was a Research Fellow with the National University of Singapore from 2012 to 2014. He is currently with the University of Electronic Science and Technology of China.

Fumin Shen received the B.S. degree from Shandong University in 2007 and the Ph.D. degree from the Nanjing University of Science and Technology, China, in 2014. He is currently an Associate Professor with the School of Computer Science and Engineering, University of Electronic Science and Technology of China. His major research interests include computer vision and machine learning, including face recognition, image analysis, hashing methods, and robust statistics with applications in computer vision.

Heng Tao Shen received the B.Sc. degree (Hons.) and the Ph.D. degree from the Department of Computer Science, National University of Singapore, in 2000 and 2004, respectively. He joined The University of Queensland as a Lecturer, a Senior Lecturer, and a Reader, where he became a Professor in 2011. He is currently a Professor of Computer Science and an ARC Future Fellow with the School of Information Technology and Electrical Engineering, The University of Queensland. He is also a Visiting Professor with Nagoya University and the National University of Singapore. His research interests mainly include multimedia/mobile/Web search, and big data management on spatial, temporal, multimedia, and social media databases. He has published extensively and served on program committees in the most prestigious international publication venues of interest. He received the Chris Wallace Award for Outstanding Research Contribution in 2010, conferred by the Computing Research and Education Association, Australasia. He is also an Associate Editor of the IEEE Transactions on Knowledge and Data Engineering. He will serve as the PC Co-Chair of ACM Multimedia 2015.

1 These authors contributed equally to this work.
