Abstract
With the rapid growth in the number of images on the Internet, ensuring the content security of images has become increasingly necessary. The key problem is retrieving relevant images from a large database. Binary embedding is an effective way to speed up similarity computation for image content security, because binary codes are storage-efficient and fast to compare. It converts real-valued signatures into binary codes while preserving the similarity of the original data, and most binary embedding methods quantize each projected dimension to a single bit (represented as 0/1). As a consequence, the discriminability of the original signatures is greatly reduced. In this paper, we first propose a novel triple-bit quantization strategy that addresses this problem by assigning 3 bits to each dimension. Then, an asymmetric distance algorithm is applied to re-rank the candidates obtained from Hamming space to produce the final nearest neighbors. For simplicity, we call the framework triple-bit quantization with asymmetric distance (TBAD). The essence of TBAD is to combine the best of binary codes and real-valued signatures to find nearest neighbors quickly and concisely. Moreover, TBAD is applicable to a wide variety of embedding techniques. Experimental comparisons on the BIGANN set show that the proposed method achieves a remarkable improvement in query accuracy over the original binary embedding methods.
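The two-stage idea summarized above (a coarse filter in Hamming space, then re-ranking with an asymmetric distance between a real-valued query and binary database codes) can be illustrated with a minimal sketch. The abstract does not specify how the 3 bits per dimension are assigned, so everything concrete here is an assumption made for illustration: the thermometer-style encoding, the three fixed thresholds, the pooled cell centroids used for reconstruction, and all function names are hypothetical and are not the authors' implementation.

```python
import numpy as np

def thermometer_codes(X, thresholds):
    """Assumed 3-bit scheme: bit j of a dimension is 1 when the value exceeds thresholds[j]."""
    bits = X[:, :, None] > thresholds[None, None, :]
    return bits.astype(np.uint8)  # shape (n, d, 3)

def cell_centroids(X, codes):
    """Mean real value of each of the 4 cells (level = number of set bits), pooled over dimensions."""
    levels = codes.sum(axis=2)
    centroids = np.zeros(4)
    for level in range(4):
        mask = levels == level
        if mask.any():
            centroids[level] = X[mask].mean()
    return centroids

def search(query, codes, centroids, thresholds, n_candidates, k):
    """Hamming-space filtering followed by asymmetric re-ranking."""
    # Stage 1: binarize the query and keep the closest codes in Hamming space.
    q_code = thermometer_codes(query[None, :], thresholds)[0]
    hamming = np.count_nonzero(codes != q_code, axis=(1, 2))
    candidates = np.argsort(hamming, kind="stable")[:n_candidates]
    # Stage 2: compare the real-valued query with values reconstructed
    # from the candidates' codes (the query itself is not quantized here).
    reconstructed = centroids[codes[candidates].sum(axis=2)]
    asymmetric = ((reconstructed - query) ** 2).sum(axis=1)
    return candidates[np.argsort(asymmetric, kind="stable")[:k]]

# Synthetic data only; the thresholds roughly split a standard normal into quartiles.
rng = np.random.default_rng(0)
base = rng.standard_normal((1000, 16))
thresholds = np.array([-0.67, 0.0, 0.67])
codes = thermometer_codes(base, thresholds)
centroids = cell_centroids(base, codes)
query = base[42] + 0.01 * rng.standard_normal(16)
result = search(query, codes, centroids, thresholds, n_candidates=50, k=5)
```

With this assumed encoding, the Hamming distance within one dimension equals the difference between quantization levels, which is one simple way for extra bits to retain more of the original ordering than a single sign bit. The re-ranking stage only touches the short candidate list, so the more expensive real-valued comparison stays cheap.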






Acknowledgements
This work was supported by the Science and Technology Planning Project of He’nan Province (162102310405), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017209), the National Natural Science Foundation of China (61671196, 61327902), and the Zhejiang Province Natural Science Foundation of China (LR17F030006).
About this article
Cite this article
Xu, D., Xie, H. & Yan, C. Triple-Bit Quantization with Asymmetric Distance for Image Content Security. Machine Vision and Applications 28, 771–779 (2017). https://doi.org/10.1007/s00138-017-0853-3
DOI: https://doi.org/10.1007/s00138-017-0853-3