Abstract
Image re-ranking is an effective way to improve text-based image retrieval. However, constructing an effective re-ranking algorithm is limited by two issues: first, the visual features extracted from images for re-ranking are too shallow to represent the full information the images contain; second, the associated text often mismatches the semantics of the images. In this paper, we use autoencoders to extract deeper image features and exploit click data to bridge the semantic gap between query words and image semantics. We propose a graph-based algorithm (MIR-AC) that adaptively integrates autoencoder features and click information by constructing two manifolds with updating weights. In particular, MIR-AC performs image re-ranking through an iterative optimization process in which the image ranking scores and the manifold weights are updated alternately. Experiments on a real-world dataset demonstrate that MIR-AC outperforms the compared state-of-the-art methods in image re-ranking.
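The abstract does not give the MIR-AC objective, so the following is only a minimal sketch of the general idea of alternating between ranking scores and manifold weights. It assumes a generic multi-graph manifold-ranking formulation (minimize the sum over graphs of alpha_k^r * f^T (I - S_k) f plus mu * ||f - y||^2, with the weights alpha_k summing to one), which may differ from the formulation used in the paper. The function names, the parameters `mu` and `r`, and the toy affinity matrices are all hypothetical.

```python
import math

def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} of an affinity matrix."""
    d = [sum(row) for row in W]
    n = len(W)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if d[i] > 0 and d[j] > 0:
                S[i][j] = W[i][j] / math.sqrt(d[i] * d[j])
    return S

def matvec(S, f):
    return [sum(s * x for s, x in zip(row, f)) for row in S]

def smoothness(S, f):
    """f^T (I - S) f: small when neighbouring images get similar scores."""
    Sf = matvec(S, f)
    return sum(x * x for x in f) - sum(x * z for x, z in zip(f, Sf))

def rerank(graphs, y, mu=1.0, r=2.0, outer=10, inner=50, eps=1e-9):
    """Alternately update ranking scores f and graph weights alpha.

    graphs: list of affinity matrices (e.g. one from autoencoder features,
            one from click features); y: initial text-based ranking scores.
    This is an assumed generic scheme, not the paper's exact algorithm.
    """
    S = [normalize(W) for W in graphs]
    n, K = len(y), len(S)
    alpha = [1.0 / K] * K
    f = list(y)
    c = 1.0 / mu
    for _ in range(outer):
        # Step 1: fix alpha, solve for f by fixed-point iteration of
        # (1 + c * sum_k a_k) f = y + c * sum_k a_k S_k f, with a_k = alpha_k^r.
        a = [w ** r for w in alpha]
        denom = 1.0 + c * sum(a)
        for _ in range(inner):
            prop = [0.0] * n
            for k in range(K):
                Sf = matvec(S[k], f)
                for i in range(n):
                    prop[i] += a[k] * Sf[i]
            f = [(y[i] + c * prop[i]) / denom for i in range(n)]
        # Step 2: fix f, give smoother graphs a larger weight (closed form
        # for r > 1 under the sum-to-one constraint).
        costs = [max(smoothness(Sk, f), 0.0) + eps for Sk in S]
        inv = [(1.0 / cost) ** (1.0 / (r - 1.0)) for cost in costs]
        total = sum(inv)
        alpha = [v / total for v in inv]
    return f, alpha

# Toy data (hypothetical): images 0 and 1 are similar, as are images 2 and 3.
visual = [[0.0, 0.9, 0.1, 0.1],
          [0.9, 0.0, 0.1, 0.1],
          [0.1, 0.1, 0.0, 0.9],
          [0.1, 0.1, 0.9, 0.0]]
click = [[0.0, 1.0, 0.0, 0.0],
         [1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0],
         [0.0, 0.0, 1.0, 0.0]]
scores, weights = rerank([visual, click], [1.0, 0.0, 0.0, 0.0])
```

In this toy setting only image 0 has a nonzero initial score, and the propagation step raises the score of its neighbour, image 1, above those of images 2 and 3. The exponent `r` controls how sharply the weights concentrate on the smoother graph; values closer to 1 make the weighting more selective.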
Acknowledgement
This work is supported by the Natural Science Foundation of China (No. 61202145, No. 61300192, No. 61472110), the Program for New Century Excellent Talents in University (No. NECT-12-0323), the Zhejiang Provincial Natural Science Foundation of China for Distinguished Young Scholars (No. LR15F020002), the Natural Science Foundation of Fujian Province of China (Grant No. 2014J01256), and the Education and Research Foundation of Fujian Province of China (Grants No. JB14082, JB12252S).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Tang, C., Zhu, Q., Hong, C., Yu, J. (2016). Multi-modal Image Re-ranking with Autoencoders and Click Semantics. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_51
DOI: https://doi.org/10.1007/978-3-319-27671-7_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)