Binary code learning via optimal class representations
Introduction
With the explosive growth of images on the web, nearest neighbor search has attracted great attention in computer vision, machine learning, information retrieval and related areas [5], [16], [8], [37], [18], [17], [19], [48], [38]. When images are high dimensional, searching efficiently becomes both challenging and crucial. Two mainstream retrieval approaches are tree-based and hashing-based methods. Tree-based methods speed up search by partitioning the data space with various tree structures; decision trees [27] and kd-trees [23] are two such methods. However, their storage and time costs grow exponentially with the dimension, which makes search inefficient in high-dimensional spaces.
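As a concrete illustration of the tree-based approach (our sketch, not from the paper), the following Python snippet indexes a set of vectors with a kd-tree via SciPy and queries nearest neighbors; in high dimensions such structures degrade toward an exhaustive scan, which motivates hashing.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 32))  # database: 10,000 points in 32-d
q = rng.standard_normal(32)           # a query vector

tree = cKDTree(X)                     # recursive spatial partition of the data
dist, idx = tree.query(q, k=5)        # 5 nearest neighbors of q (Euclidean)
print(idx, dist)
```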
To search high-dimensional data efficiently, hashing has become a promising approach. Hashing methods map high-dimensional vectors onto low-dimensional binary codes, and the mapped binary codes are used for efficient search. Beyond search, binary codes have also been widely applied in various vision tasks [21], [20], [22]. Existing hashing techniques fall into two categories: data-dependent and data-independent. Locality Sensitive Hashing (LSH) [5] is one of the most popular data-independent methods: its random hyperplane-based hash functions use projections sampled from a Gaussian distribution. Beyond Euclidean distance, extensions of LSH support several other distance measures, such as p-norm distances [2], the Mahalanobis metric [12], and kernelized similarities [11], [28]. The LSH family, however, needs long binary codes to achieve high search performance, which leads to high storage consumption.
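A minimal sketch of random-hyperplane LSH in the spirit of [5]: each bit is the sign of a projection onto a direction drawn from a Gaussian, and candidates are ranked by Hamming distance. The sizes and the brute-force ranking step are our illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 128, 64
X = rng.standard_normal((10000, d))   # database vectors
q = rng.standard_normal(d)            # query vector

W = rng.standard_normal((d, n_bits))  # random Gaussian hyperplanes
codes = X @ W >= 0                    # n x n_bits binary codes
q_code = q @ W >= 0                   # query code

# Rank database items by Hamming distance to the query code.
hamming = np.count_nonzero(codes != q_code, axis=1)
top5 = np.argsort(hamming)[:5]
print(top5, hamming[top5])
```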
Instead of generating hash functions randomly, data-dependent hashing methods learn similarity-preserving binary codes from training data. Various data-dependent methods have been proposed in the literature; representative methods fall into two groups: supervised and unsupervised. Unsupervised methods use only unlabeled data to generate binary codes; PCA Hashing [47], ITQ [6], Isotropic Hashing [9], Spectral Hashing (SH) [41] and Asymmetric Inner Product Binary Coding (AIBC) [29] are widely used examples. These unsupervised methods, however, do not exploit supervision information. Many supervised methods have therefore been proposed, such as minimal loss hashing (MLH) [24], kernel-based supervised hashing (KSH) [15], supervised discrete hashing (SDH) [30], FastHash [13], and graph cuts coding (GCC) [4].
A few hashing methods generate hash functions in a kernel space as extensions of linear hashing, such as binary reconstructive embeddings (BRE) [10] and KLSH [11]. Recently, it has been shown that compact similarity-preserving hash codes can be obtained by exploiting the nonlinear manifold structure of the data. One of the most popular methods in this category is spectral hashing (SH) [41], which generates hash codes by solving a relaxed mathematical program similar to Laplacian eigenmaps [1]. As an extension of SH, anchor graph hashing (AGH) [16] uses an anchor graph affinity, which makes training and the out-of-sample extension tractable for large-scale datasets. Inductive manifold hashing (IMH) [31], [32] introduced a new framework for generating nonlinear hash functions. Other related methods include multidimensional spectral hashing (MDSH) [40] and DGH [14].
In general, supervised methods outperform unsupervised ones because they use the supervision information of the training data. In semi-supervised hashing (SSH), a matrix S is defined to incorporate the pairwise label information; in SDH, the label information is used to classify the binary codes. However, most existing supervised methods use only the label information and ignore the relationship between classes. We believe that the semantic relationship between classes carries more detailed and specific information than labels alone, and that exploiting it can improve retrieval performance.
In this work, we propose a new method that computes a binary code for each class as its optimal representation, under the assumption that an optimal representation should be representative of its class and reflect the relationship with other classes. To capture the semantic similarity between classes, we construct a matrix depicting this similarity, and the optimal class representations are computed from it. Our contributions are as follows:
1. We propose a new supervised hashing method in which each class is assigned an optimal binary code as its class representation, on the premise that optimal class representations preserve the semantic similarity between classes well.
2. We construct a semantic relatedness matrix to depict the semantic similarity, and then compute a set of binary codes that preserve this similarity in the Hamming space (a hypothetical construction is sketched below). The binary codes of the data are expected to be close to their corresponding optimal class representations. By solving a straightforward optimization problem, the binary codes and hash functions are learned efficiently.
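The paper's exact construction of the semantic relatedness matrix is specified in the method section; as a purely hypothetical stand-in, one could score class pairs by the cosine similarity of class-mean feature vectors:

```python
import numpy as np

def semantic_relatedness(features, labels, n_classes):
    """Hypothetical semantic relatedness matrix S (c x c): cosine similarity
    between class-mean features; a stand-in, not the paper's construction."""
    means = np.stack([features[labels == c].mean(axis=0)
                      for c in range(n_classes)])
    means /= np.linalg.norm(means, axis=1, keepdims=True)
    return means @ means.T  # S[i, j] in [-1, 1]
```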
Learning the optimal class binary representations
Suppose that we have $n$ samples $X = \{x_i\}_{i=1}^{n}$. Our aim is to obtain a set of binary codes $B$ that preserves their semantic similarities well. We first seek the optimal class representations, which capture the semantic similarities between classes; the set of binary codes $B$ is then learned according to the corresponding optimal class representations.

For $c$ classes, we compute a matrix $P \in \{-1, 1\}^{c \times r}$, where $r$ is the code length, and every row $p_i^{T}$ of $P$ is the optimal representation of class $i$.
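The snippet is cut off before the optimization itself. As one hedged illustration (a standard spectral relaxation, not necessarily the paper's solver), binary class representations that roughly preserve a relatedness matrix $S$ can be taken as the signs of its top-$r$ eigenvectors, after which each sample simply inherits the code of its class:

```python
import numpy as np

def class_codes(S, r):
    """P in {-1, 1}^(c x r) from the top-r eigenvectors of S.
    A spectral-relaxation sketch, not the paper's exact algorithm."""
    vals, vecs = np.linalg.eigh(S)             # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:r]]  # top-r eigenvectors
    P = np.sign(top)
    P[P == 0] = 1                              # break ties deterministically
    return P

# With labels y (values in 0..c-1), the per-sample target codes are B = P[y].
```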
Nonlinear embedding
The hash functions map the data into a Hamming space to obtain binary codes. In this work, we learn hash functions with a kernel embedding; RBF kernel mapping is a simple yet effective choice and is widely adopted for hash functions, e.g., in BRE [10], KSH [15] and SDH [30]:
$$F(x) = \operatorname{sgn}\big(W^{T}\phi(x)\big),$$
where $\phi(x)$ is an $m$-dimensional vector obtained by the RBF kernel mapping
$$\phi(x) = \big[\exp(-\|x - a_1\|^{2}/\sigma), \ldots, \exp(-\|x - a_m\|^{2}/\sigma)\big]^{T},$$
where $a_1, \ldots, a_m$ are the $m$ anchor points randomly selected from the training data and $\sigma$ is the kernel width.
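Below is a sketch of this embedding in the form used by KSH/SDH-style methods; the anchor count, the kernel width heuristic, and the random placeholder for the projection $W$ (which the method would learn) are our assumptions.

```python
import numpy as np

def rbf_embed(X, anchors, sigma):
    """phi(x)_j = exp(-||x - a_j||^2 / sigma), one column per anchor."""
    sq = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / sigma)

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 64))                       # training data
anchors = X[rng.choice(len(X), size=300, replace=False)]  # m = 300 anchors
# Common heuristic: set sigma from the scale of squared distances.
sigma = np.median(((X[:50, None] - anchors[None]) ** 2).sum(-1))

Phi = rbf_embed(X, anchors, sigma)   # n x m kernel features
W = rng.standard_normal((300, 32))   # placeholder; learned by the optimization
B = np.sign(Phi @ W)                 # n x 32 binary codes in {-1, 1}
```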
Experimental results
Our experiments are conducted on three widely used datasets: CIFAR-10, SUN397 [43] and ImageNet [3]. The proposed method is compared against several state-of-the-art supervised hashing methods, including SDH [30], CCA-ITQ [6] and KSH [15]. For these three methods, we used the implementations provided by the original authors. For KSH, we sampled 2000 images from the training data to build the pairwise label matrix, following [15].
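The snippet is truncated before the protocol details. As a hedged sketch of a standard retrieval evaluation (not necessarily the paper's exact protocol), mean average precision over Hamming ranking, with relevance defined by shared labels, can be computed as:

```python
import numpy as np

def mean_average_precision(query_codes, db_codes, query_labels, db_labels):
    """mAP over Hamming ranking; an item is relevant if it shares the query's label."""
    aps = []
    for q, ql in zip(query_codes, query_labels):
        ham = np.count_nonzero(db_codes != q, axis=1)  # Hamming distances
        order = np.argsort(ham, kind="stable")         # rank database items
        rel = (db_labels[order] == ql).astype(float)
        if rel.sum() == 0:
            continue
        precision = np.cumsum(rel) / np.arange(1, len(rel) + 1)
        aps.append((precision * rel).sum() / rel.sum())
    return float(np.mean(aps))
```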
Conclusions
In this paper, we introduced a new hashing method for image retrieval. Unlike existing supervised methods, we focused on the semantic similarity between classes, under the assumption that the semantic relationship between classes provides useful information for improving retrieval performance. The core of our method is to find a set of binary codes for classes as their optimal representations. To exploit the semantic similarity between classes, we built a semantic relatedness matrix and computed class representations that preserve this similarity in the Hamming space.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Projects 61502081, 61472063 and 61473154, and by the Project of Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (Nanjing University of Science and Technology), Grant no. 30920140122007.
References
- et al., Structure sensitive hashing with adaptive product quantization, IEEE Trans. Cybern. (2015)
- et al., Multiple feature kernel hashing for large-scale visual search, Pattern Recognit. (2014)
- et al., Large-scale unsupervised hashing with shared structure learning, IEEE Trans. Cybern. (2015)
- et al., Locality constrained representation based classification with spatial pyramid patches, Neurocomputing (2013)
- M. Belkin, P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, in: Proceedings of...
- M. Datar, N. Immorlica, P. Indyk, V. Mirrokni, Locality-sensitive hashing scheme based on p-stable distributions, in:...
- J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, Imagenet: a large-scale hierarchical image database, in:...
- T. Ge, K. He, J. Sun, Graph cuts for supervised binary coding, in: Proceedings of the European Conference on Computer...
- A. Gionis, P. Indyk, R. Motwani, Similarity search in high dimensions via hashing, in: Proceedings of International...
- Y. Gong, S. Lazebnik, A. Gordo, F. Perronnin, Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
- B. Kulis, P. Jain, K. Grauman, Fast similarity search for learned metrics, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
- J. Lu, V.E. Liong, J. Zhou, Cost-sensitive local binary feature learning for facial age estimation, IEEE Trans. Image Process. (2015)
- J. Lu, V.E. Liong, X. Zhou, J. Zhou, Learning compact binary face descriptor for face recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2015)
- M. Muja, D.G. Lowe, Scalable nearest neighbor algorithms for high dimensional data, IEEE Trans. Pattern Anal. Mach. Intell. (2014)
Xiang Zhou is currently an undergraduate student at the University of Electronic Science and Technology of China. His major research interests include computer vision and machine learning.
Fumin Shen received his B.S. and Ph.D. degrees from Shandong University and Nanjing University of Science and Technology, China, in 2007 and 2014, respectively. He is currently a lecturer in the School of Computer Science and Engineering, University of Electronic Science and Technology of China. His major research interests include computer vision and machine learning, in particular face recognition, image analysis, hashing methods, and robust statistics with applications in computer vision.
Yang Yang is currently with the University of Electronic Science and Technology of China. He was a Research Fellow at the National University of Singapore during 2012–2014. He received his Ph.D. degree in 2012 from The University of Queensland, Australia, where he was supervised by Prof. Heng Tao Shen and Prof. Xiaofang Zhou. He obtained his Master's degree in 2009 and Bachelor's degree in 2006 from Peking University and Jilin University, respectively.
Guangwei Gao received the B.S. degree in Information and Computation Science from Nanjing Normal University, Nanjing, China, in 2009, and the Ph.D. degree in Pattern Recognition and Intelligence Systems from Nanjing University of Science and Technology, Nanjing, China, in 2014. From March 2011 to September 2011 and from February 2013 to August 2013, he was an exchange student at the Department of Computing, The Hong Kong Polytechnic University. He is now an Assistant Professor at the Institute of Advanced Technology, Nanjing University of Posts and Telecommunications. His research interests include face recognition, face hallucination and biometrics.
Yuan Wang is a Research Fellow at the Department of Industrial and Systems Engineering, National University of Singapore. She received her Ph.D. degree in Operations Research, specializing in maritime transportation optimization and simulation, from the National University of Singapore. Her research interests include mathematical modeling, complex system simulation and optimization heuristics. She is currently working on data-driven, on-demand scheduling problems at the Centre for Next Generation Logistics, NUS.