Abstract
Hashing techniques have recently attracted increasing research interest in multimedia studies. Most existing hashing methods learn hash codes from a single type of feature. Multiview data, in which each view corresponds to one type of feature, generally provide more comprehensive information, but efficiently integrating multiple views to learn compact hash codes remains challenging. In this article, we propose a novel unsupervised hashing method, dubbed multiview discrete hashing (MvDH), that effectively exploits multiview data. Specifically, MvDH performs matrix factorization to generate the hash codes as latent representations shared by multiple views, and performs spectral clustering simultaneously. This joint learning of hash codes and cluster labels enables MvDH to generate more discriminative hash codes, which are optimal for classification. An efficient alternating algorithm with guaranteed convergence and low computational complexity is developed to solve the proposed optimization problem. The binary codes are optimized via the discrete cyclic coordinate descent (DCC) method to reduce quantization errors. Extensive experimental results on three large-scale benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art methods in terms of both accuracy and scalability.
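To make the idea of hash codes as a latent representation shared by multiple views more concrete, the following is a minimal sketch, not the MvDH algorithm itself. It assumes a simplified objective, the sum over views of ||X_v - U_v B||^2 plus a ridge penalty on each U_v, with B restricted to {-1, +1}; the spectral clustering term described above is omitted entirely. The function names, the penalty `lam`, and the random toy data are all hypothetical. The codes are updated one bit (row of B) at a time while the other bits stay fixed, in the spirit of cyclic coordinate descent over discrete variables.

```python
import numpy as np


def objective(views, bases, codes, lam):
    """Regularized reconstruction error summed over all views (assumed objective)."""
    fit = sum(np.linalg.norm(X - U @ codes) ** 2 for X, U in zip(views, bases))
    reg = lam * sum(np.linalg.norm(U) ** 2 for U in bases)
    return float(fit + reg)


def multiview_binary_factorization(views, n_bits=8, n_iters=10, lam=1e-2, seed=0):
    """Alternate between per-view bases U_v and shared binary codes B.

    views: list of arrays, each of shape (d_v, n), one per view.
    Returns (codes, bases, history); codes has shape (n_bits, n).
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[1]
    # Random +/-1 initialization of the shared codes.
    codes = np.where(rng.standard_normal((n_bits, n)) >= 0, 1.0, -1.0)
    history = []
    for _ in range(n_iters):
        # Basis step: ridge regression of each view onto the current codes,
        # U_v = X_v B^T (B B^T + lam I)^{-1}.
        gram = codes @ codes.T + lam * np.eye(n_bits)
        bases = [np.linalg.solve(gram, codes @ X.T).T for X in views]

        # Code step: update one bit (row of B) at a time, others fixed.
        Q = sum(U.T @ X for X, U in zip(views, bases))  # shape (n_bits, n)
        G = sum(U.T @ U for U in bases)                 # shape (n_bits, n_bits)
        for k in range(n_bits):
            # Contribution of all other bits to the k-th coordinate.
            others = G[k] @ codes - G[k, k] * codes[k]
            z = Q[k] - others
            # Take the sign of z; keep the old bit where z is exactly zero.
            codes[k] = np.where(z != 0, np.sign(z), codes[k])

        history.append(objective(views, bases, codes, lam))
    return codes, bases, history


# Toy data: two views of the same 40 items (5-dim and 7-dim features).
rng = np.random.default_rng(1)
views = [rng.standard_normal((5, 40)), rng.standard_normal((7, 40))]
codes, bases, history = multiview_binary_factorization(views)
```

Under these assumptions each step can only lower or preserve the simplified objective, so `history` is non-increasing. That property belongs to this toy objective only and says nothing about the convergence guarantee claimed for the full method.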
Index Terms
- Multiview Discrete Hashing for Scalable Multimedia Search