aMM: Towards adaptive ranking of multi-modal documents

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Information reranking aims to recover the true order of the initial search results. Traditional reranking approaches have achieved great success in uni-modal document retrieval. However, they suffer from the following limitations when reranking multi-modal documents: (1) they are unable to capture and model the relations among multiple modalities within the same document; (2) they usually concatenate diverse features extracted from different modalities into a single vector, rather than adaptively fusing them according to their discriminative capabilities with respect to the given query; and (3) most of them consider the pairwise relations among documents but discard their higher-order grouping relations, which leads to information loss. To this end, we propose an adaptive multi-modal multi-view (\(\mathbf{aMM}\)) reranking model. This model jointly regularizes the relatedness among modalities, the effects of feature views extracted from different modalities, and the complex relations among multi-modal documents. Extensive experiments on three datasets validate the effectiveness and robustness of the proposed model.

Notes

  1. https://answers.yahoo.com/.

  2. A study [30] shows that failed image queries tend to be longer than the average successful query, which indicates that longer queries are more specific in content and also reveals the limitations of current web image search engines for complex queries.

    Table 1 The representative complex queries collected from Google Image Search Engine
  3. http://nlp.stanford.edu/software/tmt/.

  4. http://www-nlpir.nist.gov/projects/tv2012/tv2012.html.

References

  1. Bolla M (1993) Spectra, euclidean representations and clusterings of hypergraphs. Discrete Math 117(1):19–39

  2. Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the Seventh World Wide Web Conference

  3. Cai J, Zha ZJ, Wang M, Zhang S, Tian Q (2015) An attribute-assisted reranking model for web image search. Image Process IEEE Trans 24(1):261–272

  4. Deng C, Ji R, Tao D, Gao X, Li X (2014) Weakly supervised multi-graph learning for robust image reranking. Multimed IEEE Trans 16(3):785–795

  5. Dollár P, Tu Z, Tao H, Belongie S (2007) Feature mining for image classification. In: Computer Vision and Pattern Recognition, 2007. IEEE Conference on, pp 1–8

  6. Etter D, Domeniconi C (2014) Semi-supervised rank learning for multimedia known-item search. In: Proceedings of International Conference on Multimedia Retrieval, p 257

  7. Faria FF, Veloso A, Almeida HM, Valle E, Torres RdS, Gonçalves MA, Meira Jr W (2010) Learning to rank for content-based image retrieval. In: Proceedings of the international conference on Multimedia information retrieval, pp 285–294

  8. Farseev A, Nie L, Akbari M, Chua TS (2015) Harvesting multiple sources for user profile learning: a big data study. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp 235–242

  9. Fu Y, Hospedales TM, Xiang T, Fu Z, Gong S (2014) Transductive multi-view embedding for zero-shot recognition and annotation. In: Computer Vision–ECCV, pp 584–599

  10. Gao Y, Wang M, Tao D, Ji R, Dai Q (2012) 3-d object retrieval and recognition with hypergraph analysis. Image Process IEEE Trans 21(9):4290–4303

  11. Gao Y, Wang M, Zha ZJ, Shen J, Li X, Wu X (2013) Visual-textual joint relevance learning for tag-based social image search. Image Process IEEE Trans 22(1):363–376

  12. Gehler P, Nowozin S (2009) On feature combination for multiclass object classification. In: Computer Vision, IEEE 12th International Conference on, pp 221–228

  13. He J, Li M, Zhang HJ, Tong H, Zhang C (2004) Manifold-ranking based image retrieval. In: Proceedings of the 12th annual ACM international conference on Multimedia, pp 9–16

  14. He X, Ma WY, Zhang HJ (2004) Learning an image manifold for retrieval. In: Proceedings of the 12th annual ACM international conference on Multimedia, pp 17–23

  15. Hsu WH, Kennedy LS, Chang SF (2007) Video search reranking through random walk over document-level context graph. In: Proceedings of the 15th international conference on Multimedia, pp 971–980

  16. Huang Y, Liu Q, Zhang S, Metaxas DN (2010) Image retrieval via probabilistic hypergraph ranking. In: Computer Vision and Pattern Recognition, IEEE Conference on, pp 3376–3383

  17. Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on Theory of computing, pp 604–613

  18. Jeon J, Lavrenko V, Manmatha R (2003) Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pp 119–126

  19. Jin T, Yu J, You J, Zeng K, Li C, Yu Z (2015) Low-rank matrix factorization with multiple hypergraph regularizer. Pattern Recognit 48(3):1011–1022

  20. Li H, Wang M, Hua XS (2009) Msra-mm 2.0: A large-scale web multimedia dataset. In: Data Mining Workshops, IEEE International Conference on, pp 164–169

  21. Liu J, Lai W, Hua XS, Huang Y, Li S (2007) Video search re-ranking via multi-graph propagation. In: Proceedings of the 15th international conference on Multimedia, pp 208–217

  22. Liu Y, Mei T, Hua XS (2009) Crowdreranking: exploring multiple search engines for visual search reranking. In: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp 500–507

  23. Marchenko Y, Chua TS, Jain R (2006) Semi-supervised annotation of brushwork in paintings domain using serial combinations of multiple experts. In: Proceedings of the ACM International Conference on Multimedia, pp 529–538

  24. McFee B, Lanckriet GR (2010) Metric learning to rank. In: Proceedings of the 27th International Conference on Machine Learning, pp 775–782

  25. Mei T, Rui Y, Li S, Tian Q (2014) Multimedia search reranking: a literature survey. ACM Comput Surv 46(3):38

  26. Nie L, Akbari M, Li T, Chua TS (2014) A joint local-global approach for medical terminology assignment. In: Medical Information Retrieval Workshop at SIGIR, p 24

  27. Nie L, Yan S, Wang M, Hong R, Chua TS (2012) Harvesting visual concepts for image search with complex queries. In: Proceedings of the 20th ACM international conference on Multimedia, pp 59–68

  28. Nie L, Zhao YL, Akbari M, Shen J, Chua TS (2015) Bridging the vocabulary gap between health seekers and healthcare knowledge. Knowl Data Eng IEEE Trans 27(2):396–409

  29. Nie L, Zhao YL, Wang X, Shen J, Chua TS (2014) Learning to recommend descriptive tags for questions in social forums. ACM Trans Inf Syst 32(1):5

  30. Pu HT (2008) An analysis of failed queries for web image retrieval. J Inf Sci 34(3):275–289

  31. Qiu S, Wang X, Tang X (2013) Anchor concept graph distance for web image re-ranking. In: Proceedings of the 21st ACM international conference on Multimedia, pp 713–716

  32. Snoek CG, Worring M, Smeulders AW (2005) Early versus late fusion in semantic video analysis. In: Proceedings of the 13th annual ACM international conference on Multimedia, pp 399–402

  33. Song X, Nie L, Zhang L, Akbari M, Chua TS (2015) Multiple social network learning and its application in volunteerism tendency prediction. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 213–222

  34. Tian X, Yang L, Wang J, Yang Y, Wu X, Hua XS (2008) Bayesian video search reranking. In: Proceedings of the 16th ACM international conference on Multimedia, pp 131–140

  35. Wang C, Zhang L, Zhang HJ (2008) Learning to reduce the semantic gap in web image retrieval and annotation. In: Proceedings of the 31st annual international ACM conference on Research and development in information retrieval, pp 355–362

  36. Wang L, Yang L, Tian X (2009) Query aware visual similarity propagation for image search reranking. In: Proceedings of the 17th ACM international conference on Multimedia, pp 725–728

  37. Wang M, Li H, Tao D, Lu K, Wu X (2012) Multimodal graph-based reranking for web image search. Image Process IEEE Trans 21(11):4649–4661

  38. Wang X, Qiu S, Liu K, Tang X (2014) Web image re-ranking using query-specific semantic signatures. Pattern Anal Mach Intell IEEE Trans 36(4):810–823

  39. Wang Y, Wu F, Song J, Li X, Zhuang Y (2014) Multi-modal mutual topic reinforce modeling for cross-media retrieval. In: Proceedings of the ACM International Conference on Multimedia, pp 307–316

  40. Wu J, Hong Z, Pan S, Zhu X, Cai Z, Zhang C (2014) Exploring features for complicated objects: Cross-view feature selection for multi-instance learning. In: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, pp 1699–1708

  41. Xu B, Bu J, Chen C, Cai D, He X, Liu W, Luo J (2011) Efficient manifold ranking for image retrieval. In: Proceedings of the 34th international ACM conference on Research and development in Information Retrieval, pp 525–534

  42. Ye G, Liu D, Jhuo IH, Chang SF (2012) Robust late fusion with rank minimization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

  43. Yu J, Rui Y, Chen B (2014) Exploiting click constraints and multi-view features for image re-ranking. Multimed IEEE Trans 16(1):159–168

  44. Yu J, Tao D, Wang M (2012) Adaptive hypergraph learning and its application in image classification. Image Process IEEE Trans 21(7):3262–3272

  45. Yu J, Tao D, Wang M, Rui Y (2015) Learning to rank using user clicks and visual features for image retrieval. Cybern IEEE Trans 45(4):767–779

  46. Yuan J, Zhao YL, Luan H, Wang M, Chua TS (2014) Memory recall based video search: finding videos you have seen before based on your memory. ACM Trans Multimed Comput Commun Appl 10(2):21

  47. Zhou D, Bousquet O, Lal TN, Weston J, Schölkopf B (2004) Learning with local and global consistency. Adv Neural Inf Process Syst 16(16):321–328

  48. Zhou D, Huang J, Schölkopf B (2005) Learning from labeled and unlabeled data on a directed graph. In: Proceedings of the 22nd international conference on Machine learning, pp 1036–1043

  49. Zhou D, Huang J, Schölkopf B (2006) Learning with hypergraphs: Clustering, classification, and embedding. In: Advances in neural information processing systems, pp 1601–1608

  50. Zien JY, Schlag MD, Chan PK (1999) Multilevel spectral hypergraph partitioning with arbitrary vertex sizes. Comput-Aided Des Integr Circuits Syst, IEEE Trans 18(9):1389–1399

Corresponding author

Correspondence to Mohammad Akbari.

Cite this article

Akbari, M., Nie, L. & Chua, TS. aMM: Towards adaptive ranking of multi-modal documents. Int J Multimed Info Retr 4, 233–245 (2015). https://doi.org/10.1007/s13735-015-0088-x

  • DOI: https://doi.org/10.1007/s13735-015-0088-x
