Multi-modal Image Re-ranking with Autoencoders and Click Semantics

Tang, Chaohui; Zhu, Qingxin; Hong, Chaoqun; Yu, Jun

doi:10.1007/978-3-319-27671-7_51

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Image re-ranking is effective in improving the text-based image retrieval experience. However, constructing an effective re-ranking algorithm is hampered by two issues: first, the visual features extracted from images for re-ranking are too superficial to represent all the information the images contain; second, the associated text often mismatches the semantics of the images. In this paper, we use autoencoders to extract deeper image features and exploit click data to bridge the semantic gap between query words and image semantics. A graph-based algorithm (MIR-AC) is proposed that adaptively integrates the autoencoder features and click information by constructing two manifolds whose weights are updated during optimization. In particular, MIR-AC performs image re-ranking through an iterative optimization process in which the image ranking scores and the manifold weights are updated alternately. Experiments on a real-world dataset demonstrate that MIR-AC outperforms the state-of-the-art image re-ranking methods it is compared against.
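
The abstract does not give the MIR-AC objective, so the following is only a minimal sketch of the general idea it describes: two graphs (one per modality) combined with adaptive weights, with ranking scores and weights updated in alternation. It assumes a standard multi-graph regularization objective, sum_k alpha_k^r f^T L_k f + lam * ||f - y||^2 with the weights summing to one, which may differ from the paper's formulation. All names here (normalized_laplacian, rerank, lam, r, sigma) are hypothetical, and the random matrices merely stand in for autoencoder features and click features.

```python
import numpy as np

def normalized_laplacian(X, sigma=1.0):
    """Build a normalized graph Laplacian from a Gaussian affinity (assumed choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                      # no self-loops
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    return np.eye(len(X)) - S

def rerank(laplacians, y, lam=1.0, r=2.0, n_iter=20):
    """Alternate between ranking scores f and graph weights alpha.

    Assumed objective (not taken from the paper):
        sum_k alpha_k**r * f^T L_k f + lam * ||f - y||^2,  sum_k alpha_k = 1
    """
    n = len(y)
    alpha = np.full(len(laplacians), 1.0 / len(laplacians))
    f = y.astype(float)
    for _ in range(n_iter):
        # Step 1: fix alpha, solve the linear system for f in closed form.
        L = sum(a ** r * Lk for a, Lk in zip(alpha, laplacians))
        f = np.linalg.solve(L + lam * np.eye(n), lam * y)
        # Step 2: fix f, update alpha; smoother graphs receive larger weight.
        costs = np.array([max(f @ Lk @ f, 1e-12) for Lk in laplacians])
        w = (1.0 / costs) ** (1.0 / (r - 1.0))
        alpha = w / w.sum()
    return f, alpha

# Usage with placeholder data: 12 images, two feature views.
rng = np.random.default_rng(0)
deep_feats = rng.normal(size=(12, 5))    # stand-in for autoencoder features
click_feats = rng.normal(size=(12, 3))   # stand-in for click-based features
y = np.linspace(1.0, 0.0, 12)            # initial scores from the text-based ranking
scores, weights = rerank(
    [normalized_laplacian(deep_feats), normalized_laplacian(click_feats)], y
)
new_order = np.argsort(-scores)          # re-ranked image indices, best first
```

The exponent r > 1 keeps the weight update from collapsing onto a single graph; with r close to 1 the weights concentrate on whichever graph yields the smoothest scores.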

Acknowledgement

This work is supported by the Natural Science Foundation of China (No. 61202145, No. 61300192, No. 61472110), the Program for New Century Excellent Talents in University (No. NECT-12-0323), Zhejiang Provincial Natural Science Foundation of China for Distinguished Young Scholars (No. LR15F020002), the Natural Science Foundation of Fujian Province of China under Grants (No. 2014J01256), and the education and research Foundation of Fujian Province of China under Grants (No. JB14082, JB12252S).

Author information

Authors and Affiliations

School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
Chaohui Tang & Qingxin Zhu
School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, 361024, China
Chaohui Tang & Chaoqun Hong
School of Computer Science, Hangzhou Dianzi University, Hangzhou, 310018, China
Jun Yu

Corresponding author

Correspondence to Chaoqun Hong.

Editor information

Editors and Affiliations

University of Texas at San Antonio, San Antonio, USA
Qi Tian
Dept. of Information Engineering, University of Trento, Povo, Trento, Italy
Nicu Sebe
EECS, University of Central Florida, Orlando, Florida, USA
Guo-Jun Qi
EURECOM, Sophia-Antipolis, France
Benoit Huet
Hefei University of Technology, Hefei, Anhui, China
Richang Hong
School of Computing and Information, Hefei University of Technology, Hefei, Anhui, China
Xueliang Liu

About this paper

Cite this paper

Tang, C., Zhu, Q., Hong, C., Yu, J. (2016). Multi-modal Image Re-ranking with Autoencoders and Click Semantics. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_51

DOI: https://doi.org/10.1007/978-3-319-27671-7_51
Published: 03 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)
