Abstract
Hashing has been widely applied to large scale semantic image retrieval. Unsupervised hashing cannot work well for semantic image retrieval while supervised hashing requiring full label information for large databases is not practical. Semi-supervised hashing (SSH) solves this problem by learning the semantic information from a small portion of labeled images and the data structure information from the unlabeled images. The major drawback of the current SSH is that they cannot guarantee the maximization of entropy over all hash bits for a better coding efficiency. We propose a SSH which maximizes the conditional entropy of a bit with respect to all previous bits, i.e. the sequential conditional entropy maximization SSH. It is further extended to a nonlinear SSH with a new mapping method to enhance precision and recall rates. Experimental results show that the nonlinear SSH does not work well for all cases, and a heuristic guideline for the selection between linear and nonlinear hashing is given based on the characteristics of the database. We also propose a multiple hashing version of the proposed method for high recall rate with few hash bucket visits. Experimental results show that the proposed method outperforms current SSH methods.
Similar content being viewed by others
References
Friedman JH, Bentley JL, Finkel RA (1977) An algorithm for finding best matches in logarithmic expected time. ACM Trans Math Softw 3(3):209–226
Silpa-Anan C, Hartley R (2008) Optimised kd-trees for fast image descriptor matching. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8
Ciaccia P, Patella M, Zezula P (1997) M-tree: an efficient access method for similarity search in metric spaces. In: Proceedings of the international conference on very large data bases, vol 23. Morgan Kaufmann Pub, San Francisco, pp 426–435
Beygelzimer A, Kakade S, Langford J (2006) Cover trees for nearest neighbor. In: Proceedings of the 23rd international conference on machine learning. ACM, New York, pp 97–104
Uhlmann JK (1991) Satisfying general proximity/similarity queries with metric trees. Inf Process Lett 40(4):175–179
Indyk P (2004) Nearest neighbors in high-dimensional spaces. CRC Press LLC, FL
Kundu MK, Chowdhury M, Banerjee M (2012) Interactive image retrieval using m-band wavelet, earth movers distance and fuzzy relevance feedback. Int J Mach Learn Cybern 3(4):285–296
Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on theory of computing. ACM, New York, pp 604–613
Gionis A, Indyk P, Motwani R et al (1999) Similarity search in high dimensions via hashing. In: Proceedings of the international conference on very large data bases, vol 99, pp 518–529
Andoni A, Indyk P (2006) Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: 47th annual IEEE symposium on foundations of computer science. IEEE, pp 459–468
Weiss Y, Torralba A, Fergus R (2008) Spectral hashing. NIPS 9(1):6
Liu W, Wang J, Kumar S, Chang SF (2011) Hashing with graphs. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 1–8
He J, Radhakrishnan R, Chang SF, Bauer C (2011) Compact hashing with joint optimization of search accuracy and time. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 753–760
Gong Y, Lazebnik S, Gordo A, Perronnin F (2013) Iterative quantization: a procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans Pattern Anal Mach Intell 35(12):2916–2929
Norouzi M, Fleet DJ (2013) Cartesian k-means. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 3017–3024
Lin Y, Jin R, Cai D, Yan S, Li X (2013) Compressed hashing. In: Proceedings of IEEE conference on computer vision and pattern recognition, ser. CVPR’13. IEEE Computer Society, Washington, DC, pp 446–451. doi:10.1109/CVPR.2013.64
He K, Wen F, Sun J (2013) K-means hashing: an affinity-preserving quantization method for learning binary compact codes. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 2938–2945
Gong Y, Kumar S, Rowley HA, Lazebnik S (2013) Learning binary codes for high-dimensional data using bilinear projections. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 484–491
Liu W, Wang J, Ji R, Jiang YG, Chang SF (2012) Supervised hashing with kernels. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 2074–2081
Strecha C, Bronstein AM, Bronstein MM, Fua P (2012) Ldahash: improved matching with smaller descriptors. IEEE Trans Pattern Anal Mach Intell 34(1):66–78
Salakhutdinov R, Hinton G (2009) Semantic hashing. Int J Approx Reason 50(7):969–978
Norouzi M, Blei DM (2011) Minimal loss hashing for compact binary codes. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 353–360
Xia Z, Feng X, Peng J, Wu J, Fan J (2015) A regularized optimization framework for tag completion and image retrieval. Neurocomputing 147:500–508. (Advances in self-organizing maps subtitle of the special issue: selected papers from the workshop on self-organizing maps 2012 (WSOM’12)
Wang J, Kumar S, Chang S-F (2012) Semi-supervised hashing for large-scale search. IEEE Trans Pattern Anal Mach Intell 34(12):2393–2406
Wu C, Zhu J, Cai D, Chen C, Bu J (2013) Semi-supervised nonlinear hashing using bootstrap sequential projection learning. IEEE Trans Knowl Data Eng 25(6):1380–1393
Datar M, Immorlica N, Indyk P, Mirrokni VS (2004) Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the twentieth annual symposium on computational geometry. ACM, New York, pp 253–262
Jain P, Kulis B, Grauman K (2008) Fast image search for learned metrics. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8
Grauman K, Darrell T (2007) Pyramid match hashing: sub-linear time indexing over partial correspondences. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8
Wang J, Kumar S, Chang S-F (2010) Semi-supervised hashing for scalable image retrieval. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 3424–3431
Wang J, Kumar S, Chang S-F (2010) Sequential projection learning for hashing with compact codes. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 1127–1134
Tanha J, van Someren M, Afsarmanesh H (2015) Semi-supervised self-training for decision tree classifiers. Int J Mach Learn Cybern 1–16. doi:10.1007/s13042-015-0328-7
Chen W-J, Shao Y-H, Hong N (2014) Laplacian smooth twin support vector machine for semi-supervised classification. Int J Mach Learn Cybern 5(3):459–468. doi:10.1007/s13042-013-0183-3
Jiang J, Yan X, Yu Z, Guo J, Tian W (2014) A chinese expert disambiguation method based on semi-supervised graph clustering. Int J MachLearn Cybern 1–8. doi:10.1007/s13042-014-0255-z
Alok A, Saha S, Ekbal A (2015) Semi-supervised clustering for gene-expression data in multiobjective optimization framework. Int J Mach Learn Cybern 1–19. doi:10.1007/s13042-015-0335-8
Yao C, Bu J, Wu C, Chen G (2013) Semi-supervised spectral hashing for fast similarity search. Neurocomputing 101:52–58
Xu H, Wang J, Li Z, Zeng G, Li S, Yu N (2011) Complementary hashing for approximate nearest neighbor search. In: IEEE international conference on computer vision. IEEE, pp 1631–1638
Li P, Cheng J, Lu H (2013) Hashing with dual complementary projection learning for fast image retrieval. Neurocomputing 120:83–89
Fu H, Kong X, Lu J (2013) Large-scale image retrieval based on boosting iterative quantization hashing with query-adaptive reranking. Neurocomputing 122:480–489
Fei-Fei L, Fergus R, Perona P (2007) Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories. Comput Vis Image Underst 106(1):59–70
Wang J, Yang J, Yu K, Lv F, Huang T, Gong Y (2010) Locality-constrained linear coding for image classification. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 3360–3367
Acknowledgments
This work is supported by a National Natural Science Foundation of China (61272201) and a Program for New Century Excellent Talents in University of China (NCET-11-0162).
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Ng, W.W.Y., Lv, Y., Zeng, Z. et al. Sequential conditional entropy maximization semi-supervised hashing for semantic image retrieval. Int. J. Mach. Learn. & Cyber. 8, 571–586 (2017). https://doi.org/10.1007/s13042-015-0350-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-015-0350-9