Click data guided query modeling with click propagation and sparse coding

Multimedia Tools and Applications

Abstract

We address the problem of fine-grained image recognition using user click data, wherein each image is represented as a semantic query-click feature vector. The query set obtained from search engines is usually large and redundant, which makes the click feature high-dimensional and sparse. We propose a novel query modeling approach that merges semantically similar queries and constructs a compact click feature from the merged queries. To deal with the sparsity and inconsistency of the click feature, we design a graph-based propagation approach that predicts the zero clicks, ensuring that similar images have similar clicks for each query. Afterwards, using the propagated click feature, we formulate the query merging problem as a sparse coding based recognition task, in which the hot queries are used to construct the dictionary. We evaluate our method for fine-grained image recognition on the public Clickture-Dog dataset. The results show that the propagated click feature performs much better than the original one. In the query merging procedure, sparse coding performs better than the traditional K-means algorithm, and the “hot queries” outperform K-SVD in dictionary learning.
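
The two steps named in the abstract, click propagation over an image graph and merging queries against a dictionary of hot queries, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the authors' formulation: the function names `propagate_clicks` and `merge_queries`, the Gaussian affinity between visual features, the label-propagation style update with weight `alpha`, the choice of hot queries by total click count, and the one-atom coding (each query is assigned to its most similar hot query by cosine similarity), which is the simplest special case of sparse coding. The toy click matrix is made up.

```python
from math import exp, sqrt

def propagate_clicks(clicks, visual, alpha=0.5, sigma=1.0, iters=50):
    """Fill in zero clicks so that visually similar images get similar clicks.

    clicks: n_images x n_queries click counts (list of lists).
    visual: n_images x d visual features (assumed given, e.g. from a CNN).
    """
    n, m = len(clicks), len(clicks[0])

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Assumed Gaussian affinity between images, zero on the diagonal.
    W = [[0.0 if i == j else exp(-sqdist(visual[i], visual[j]) / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Row-normalize so each image averages over its neighbours.
    S = [[w / (sum(row) or 1.0) for w in row] for row in W]

    F = [row[:] for row in clicks]
    for _ in range(iters):
        # Mix the neighbours' current clicks with the observed clicks.
        F = [[alpha * sum(S[i][k] * F[k][q] for k in range(n))
              + (1 - alpha) * clicks[i][q] for q in range(m)] for i in range(n)]
    return F

def merge_queries(F, n_hot=2):
    """Merge queries by assigning each one to its most similar hot query."""
    n, m = len(F), len(F[0])
    cols = [[F[i][q] for i in range(n)] for q in range(m)]  # one vector per query
    totals = [sum(c) for c in cols]
    # Dictionary atoms: the n_hot queries with the most (propagated) clicks.
    hot = sorted(sorted(range(m), key=lambda q: -totals[q])[:n_hot])

    def cosine(a, b):
        na = sqrt(sum(x * x for x in a)) or 1.0
        nb = sqrt(sum(x * x for x in b)) or 1.0
        return sum(x * y for x, y in zip(a, b)) / (na * nb)

    # One-atom code: index of the best-matching hot query for every query.
    assign = [max(hot, key=lambda h: cosine(cols[q], cols[h])) for q in range(m)]
    # Compact feature: per image, sum the clicks of all queries in each group.
    compact = [[sum(F[i][q] for q in range(m) if assign[q] == h) for h in hot]
               for i in range(n)]
    return hot, assign, compact

# Toy data: images 0-1 and 2-3 form two visual clusters; queries 0 and 2 are "hot".
clicks = [[5, 2, 0, 0],
          [4, 0, 0, 0],
          [0, 0, 7, 0],
          [0, 0, 3, 1]]
visual = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]

propagated = propagate_clicks(clicks, visual, alpha=0.5)
hot, assign, compact = merge_queries(propagated, n_hot=2)
print(hot, assign)
```

In this toy setting the zero click of image 1 on query 1 becomes positive after propagation because its visual neighbour, image 0, was clicked for that query, and the four queries collapse into two groups, so each image ends up with a two-dimensional click feature. A full sparse coding solver would allow a query to be reconstructed from several dictionary atoms instead of exactly one.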

Notes

  1. https://github.com/Zjutanmin/SCodeByClick.git.

  2. The optimal α is 0.9 and 0.5 for Prop-E and Prop-W respectively.

  3. We use the 16-layer VGG-net [13], which has 13 convolutional layers and 3 fully connected layers, to learn a CNN model. It is pre-trained on the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) 2012 dataset.
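
The layer count in note 3 can be checked against the standard VGG-16 layout (configuration D in [13]). The short sketch below only counts weight layers; the variable names are illustrative, and the 1000-way final layer is the ImageNet default, which may differ from the classifier the authors fine-tuned.

```python
# Standard VGG-16 layout: integers are output channels of 3x3 convolutions,
# "M" marks a 2x2 max-pooling layer (pooling layers carry no weights).
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
# Fully connected layers; 1000 outputs is the ImageNet default (assumption here).
VGG16_FC = [4096, 4096, 1000]

n_conv = sum(1 for v in VGG16_CONV if v != "M")
n_fc = len(VGG16_FC)
print(n_conv, n_fc, n_conv + n_fc)
```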

References

  1. Berg T, Liu J, Lee SW, Alexander ML, Jacobs DW, Belhumeur PN (2014) Birdsnap: large-scale fine-grained visual categorization of birds. In: IEEE Conference on computer vision and pattern recognition, pp 2019–2026

  2. Chang YS (2017) Fine-grained attention for image caption generation. Multimed Tool Appl PP(7):1–13

  3. Cilibrasi RL, Vitanyi P (2007) The google similarity distance. IEEE Trans Knowl Data Eng 19(3):370–383

  4. Datta D, Singh SK, Chowdary CR (2017) Bridging the gap: effect of text query reformulation in multimodal retrieval. Multimed Tool Appl 76:1–18

  5. Feng L, Bhanu B (2016) Semantic concept co-occurrence patterns for image annotation and retrieval. IEEE Trans Pattern Anal Mach Intell 38(4):1–1

  6. Hua XS, Yang L, Wang J, Wang J, Ye M, Wang K, Rui Y, Li J (2013) Clickage: towards bridging semantic and intent gaps via mining click logs of search engines. In: ACM International conference on multimedia. ACM, pp 243–252

  7. Khosla A, Jayadevaprakash N, Yao B, Fei-Fei L (2011) Novel dataset for fine-grained image categorization. In: First workshop on fine-grained visual categorization, IEEE conference on computer vision and pattern recognition. Colorado Springs, CO

  8. Li C, Song Q, Wang Y, Song H, Kang Q, Cheng J, Lu H (2016) Learning to recognition from bing clickture data. In: IEEE International conference on multimedia and expo workshops, pp 1–4

  9. Liu T, Tao D (2016) Classification with noisy labels by importance reweighting. IEEE Trans Pattern Anal Mach Intell 38(3):447–461

  10. Nie L, Wang M, Zha Z, Li G, Chua TS (2011) Multimedia answering: enriching text QA with media information. In: ACM SIGIR Conference on research and development in information retrieval, SIGIR '11. ACM, pp 695–704

  11. Nie L, Wang M, Zha ZJ, Chua TS (2012) Oracle in image search: a content-based approach to performance prediction. ACM Trans Inf Syst 30(2):13:1–13:23

  12. Nie L, Yan S, Wang M, Hong R, Chua TS (2012) Harvesting visual concepts for image search with complex queries. In: ACM International conference on multimedia, MM’12. ACM, pp 59–68

  13. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  14. Tan M, Wang Y, Pan G (2012) Feature reduction for efficient object detection via L1-norm latent SVM. In: Intelligent science and intelligent data engineering

  15. Tan M, Pan G, Wang Y, Zhang Y, Wu Z (2014) L1-norm latent svm for compact features in object detection. Neurocomputing 139(139):56–64

  16. Tan M, Hu Z, Wang B, Zhao J, Wang Y (2016) Robust object recognition via weakly supervised metric and template learning. Neurocomputing 101:96–107

  17. Tan M, Wang B, Wu Z, Wang J, Pan G (2016) Weakly supervised metric learning for traffic sign recognition in a lidar-equipped vehicle. IEEE Trans Intell Transp Syst 17(5):1415–1427. https://doi.org/10.1109/TITS.2015.2506182

  18. Tan M, Yu J, Zheng G, Wu W, Sun K (2016) Deep neural network boosted large scale image recognition using user click data. In: International conference on internet multimedia computing and service, pp 118–121

  19. Lin TY, RoyChowdhury A, Maji S (2015) Bilinear CNN models for fine-grained visual recognition. In: IEEE International conference on computer vision

  20. Wang R, Liu T, Tao D (2017) Multiclass learning with partially corrupted labels. IEEE Trans Neural Netw Learn Syst PP(99):1–13

  21. Yan Y, Nie F, Li W, Gao C, Yang Y, Xu D (2016) Image classification by cross-media active learning with privileged information. IEEE Trans Multimedia 18(12):2494–2502

  22. Yan C, Luo M, Liu W, Zheng Q (2017) Robust dictionary learning with graph regularization for unsupervised person re-identification. Multimed Tool Appl (2):1–25

  23. Yang Y, Nie F, Xu D, Luo J, Zhuang Y, Pan Y (2012) A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Trans Pattern Anal Mach Intell 34(4):723–742

  24. Yu J, Wang M, Tao D (2012) Semisupervised multiview distance metric learning for cartoon synthesis. IEEE Trans Image Process 21(11):4636–4648

  25. Yu J, Rui Y, Chen B (2014) Exploiting click constraints and multi-view features for image re-ranking. IEEE Trans Multimedia 16(1):159–168

  26. Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019–2032

  27. Yu J, Tao D, Meng W, Yong R (2015) Learning to rank using user clicks and visual features for image retrieval. IEEE Trans Cybern 45(4):767–779

  28. Zhang H, Zha ZJ, Yang Y, Yan S, Chua TS (2014) Robust (semi) nonnegative graph embedding. IEEE Trans Image Process 23(7):2996–3012

  29. Zhang H, Zha ZJ, Yang Y, Yan S, Gao Y, Chua TS (2014) Attribute-augmented semantic hierarchy: towards a unified framework for content-based image retrieval. ACM Trans Multimed Comput Commun Appl 11(1s):1–21

  30. Zhang J, Nie L, Wang X, He X, Huang X, Chua TS (2016) Shorter-is-better: venue category estimation from micro-video. In: ACM On multimedia conference, pp 1415–1424

  31. Zhang Y, Wei XS, Wu J, Cai J (2016) Weakly supervised fine-grained categorization with part-based image representation. IEEE Trans Image Process 25(4):1713–1725

  32. Zhang H, Huang Y, Xu X, Zhu Z, Deng C (2017) Latent semantic factorization for multimedia representation learning. Multimed Tool Appl (1):1–16

  33. Zheng G, Tan M, Yu J, Wu Q, Fan J (2017) Fine-grained image recognition via weakly supervised click data guided bilinear CNN model. In: IEEE International conference on multimedia and expo (accepted). IEEE

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China (No. 61602136, No. 61622205, No. 61472110) and the Zhejiang Provincial Natural Science Foundation of China under Grant LR15F020002.

Author information

Corresponding author

Correspondence to Jun Yu.

About this article

Cite this article

Tan, M., Yu, J., Huang, Q. et al. Click data guided query modeling with click propagation and sparse coding. Multimed Tools Appl 77, 22145–22158 (2018). https://doi.org/10.1007/s11042-018-5703-4

  • DOI: https://doi.org/10.1007/s11042-018-5703-4
