Abstract
Information reranking aims to recover the true order of the initial search results. Traditional reranking approaches have achieved great success in uni-modal document retrieval. However, they suffer from the following limitations when reranking multi-modal documents: (1) they are unable to capture and model the relations among multiple modalities within the same document; (2) they usually concatenate the diverse features extracted from different modalities into a single vector, rather than adaptively fusing them according to their discriminative capabilities with respect to the given query; and (3) most of them consider the pairwise relations among documents but discard their higher-order grouping relations, which leads to information loss. To this end, we propose an adaptive multi-modal multi-view (\(\mathbf{aMM}\)) reranking model. This model jointly regularizes the relatedness among modalities, the effects of feature views extracted from different modalities, and the complex relations among multi-modal documents. Extensive experiments on three datasets validate the effectiveness and robustness of the proposed model.
Notes
A study in [30] shows that failed image queries tend to be longer than the average successful query. This suggests that longer queries are more specific in content, and it also reveals the limitations of current web image search engines on complex queries.
References
Bolla M (1993) Spectra, Euclidean representations and clusterings of hypergraphs. Discrete Math 117(1):19–39
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the Seventh World Wide Web Conference
Cai J, Zha ZJ, Wang M, Zhang S, Tian Q (2015) An attribute-assisted reranking model for web image search. Image Process IEEE Trans 24(1):261–272
Deng C, Ji R, Tao D, Gao X, Li X (2014) Weakly supervised multi-graph learning for robust image reranking. Multimed IEEE Trans 16(3):785–795
Dollár P, Tu Z, Tao H, Belongie S (2007) Feature mining for image classification. In: Computer Vision and Pattern Recognition, 2007. IEEE Conference on, pp 1–8
Etter D, Domeniconi C (2014) Semi-supervised rank learning for multimedia known-item search. In: Proceedings of International Conference on Multimedia Retrieval, p 257
Faria FF, Veloso A, Almeida HM, Valle E, Torres RdS, Gonçalves MA, Meira Jr W (2010) Learning to rank for content-based image retrieval. In: Proceedings of the international conference on Multimedia information retrieval, pp 285–294
Farseev A, Nie L, Akbari M, Chua TS (2015) Harvesting multiple sources for user profile learning: a big data study. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, pp 235–242
Fu Y, Hospedales TM, Xiang T, Fu Z, Gong S (2014) Transductive multi-view embedding for zero-shot recognition and annotation. In: Computer Vision–ECCV, pp 584–599
Gao Y, Wang M, Tao D, Ji R, Dai Q (2012) 3-D object retrieval and recognition with hypergraph analysis. Image Process IEEE Trans 21(9):4290–4303
Gao Y, Wang M, Zha ZJ, Shen J, Li X, Wu X (2013) Visual-textual joint relevance learning for tag-based social image search. Image Process IEEE Trans 22(1):363–376
Gehler P, Nowozin S (2009) On feature combination for multiclass object classification. In: Computer Vision, IEEE 12th International Conference on, pp 221–228
He J, Li M, Zhang HJ, Tong H, Zhang C (2004) Manifold-ranking based image retrieval. In: Proceedings of the 12th annual ACM international conference on Multimedia, pp 9–16
He X, Ma WY, Zhang HJ (2004) Learning an image manifold for retrieval. In: Proceedings of the 12th annual ACM international conference on Multimedia, pp 17–23
Hsu WH, Kennedy LS, Chang SF (2007) Video search reranking through random walk over document-level context graph. In: Proceedings of the 15th international conference on Multimedia, pp 971–980
Huang Y, Liu Q, Zhang S, Metaxas DN (2010) Image retrieval via probabilistic hypergraph ranking. In: Computer Vision and Pattern Recognition, IEEE Conference on, pp 3376–3383
Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on Theory of computing, pp 604–613
Jeon J, Lavrenko V, Manmatha R (2003) Automatic image annotation and retrieval using cross-media relevance models. In: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, pp 119–126
Jin T, Yu J, You J, Zeng K, Li C, Yu Z (2015) Low-rank matrix factorization with multiple hypergraph regularizer. Pattern Recognit 48(3):1011–1022
Li H, Wang M, Hua XS (2009) MSRA-MM 2.0: A large-scale web multimedia dataset. In: Data Mining Workshops, IEEE International Conference on, pp 164–169
Liu J, Lai W, Hua XS, Huang Y, Li S (2007) Video search re-ranking via multi-graph propagation. In: Proceedings of the 15th international conference on Multimedia, pp 208–217
Liu Y, Mei T, Hua XS (2009) Crowdreranking: exploring multiple search engines for visual search reranking. In: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp 500–507
Marchenko Y, Chua TS, Jain R (2006) Semi-supervised annotation of brushwork in paintings domain using serial combinations of multiple experts. In: Proceedings of the ACM International Conference on Multimedia, pp 529–538
McFee B, Lanckriet GR (2010) Metric learning to rank. In: Proceedings of the 27th International Conference on Machine Learning, pp 775–782
Mei T, Rui Y, Li S, Tian Q (2014) Multimedia search reranking: a literature survey. ACM Comput Surv 46(3):38
Nie L, Akbari M, Li T, Chua TS (2014) A joint local-global approach for medical terminology assignment. In: Medical Information Retrieval Workshop at SIGIR, p 24
Nie L, Yan S, Wang M, Hong R, Chua TS (2012) Harvesting visual concepts for image search with complex queries. In: Proceedings of the 20th ACM international conference on Multimedia, pp 59–68
Nie L, Zhao YL, Akbari M, Shen J, Chua TS (2015) Bridging the vocabulary gap between health seekers and healthcare knowledge. Knowl Data Eng IEEE Trans 27(2):396–409
Nie L, Zhao YL, Wang X, Shen J, Chua TS (2014) Learning to recommend descriptive tags for questions in social forums. ACM Trans Inf Syst 32(1):5
Pu HT (2008) An analysis of failed queries for web image retrieval. J Inf Sci 34(3):275–289
Qiu S, Wang X, Tang X (2013) Anchor concept graph distance for web image re-ranking. In: Proceedings of the 21st ACM international conference on Multimedia, pp 713–716
Snoek CG, Worring M, Smeulders AW (2005) Early versus late fusion in semantic video analysis. In: Proceedings of the 13th annual ACM international conference on Multimedia, pp 399–402
Song X, Nie L, Zhang L, Akbari M, Chua TS (2015) Multiple social network learning and its application in volunteerism tendency prediction. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 213–222
Tian X, Yang L, Wang J, Yang Y, Wu X, Hua XS (2008) Bayesian video search reranking. In: Proceedings of the 16th ACM international conference on Multimedia, pp 131–140
Wang C, Zhang L, Zhang HJ (2008) Learning to reduce the semantic gap in web image retrieval and annotation. In: Proceedings of the 31st annual international ACM conference on Research and development in information retrieval, pp 355–362
Wang L, Yang L, Tian X (2009) Query aware visual similarity propagation for image search reranking. In: Proceedings of the 17th ACM international conference on Multimedia, pp 725–728
Wang M, Li H, Tao D, Lu K, Wu X (2012) Multimodal graph-based reranking for web image search. Image Process IEEE Trans 21(11):4649–4661
Wang X, Qiu S, Liu K, Tang X (2014) Web image re-ranking using query-specific semantic signatures. Pattern Anal Mach Intell IEEE Trans 36(4):810–823
Wang Y, Wu F, Song J, Li X, Zhuang Y (2014) Multi-modal mutual topic reinforce modeling for cross-media retrieval. In: Proceedings of the ACM International Conference on Multimedia, pp 307–316
Wu J, Hong Z, Pan S, Zhu X, Cai Z, Zhang C (2014) Exploring features for complicated objects: Cross-view feature selection for multi-instance learning. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pp 1699–1708
Xu B, Bu J, Chen C, Cai D, He X, Liu W, Luo J (2011) Efficient manifold ranking for image retrieval. In: Proceedings of the 34th international ACM conference on Research and development in Information Retrieval, pp 525–534
Ye G, Liu D, Jhuo IH, Chang SF (2012) Robust late fusion with rank minimization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Yu J, Rui Y, Chen B (2014) Exploiting click constraints and multi-view features for image re-ranking. Multimed IEEE Trans 16(1):159–168
Yu J, Tao D, Wang M (2012) Adaptive hypergraph learning and its application in image classification. Image Process IEEE Trans 21(7):3262–3272
Yu J, Tao D, Wang M, Rui Y (2015) Learning to rank using user clicks and visual features for image retrieval. Cybern IEEE Trans 45(4):767–779
Yuan J, Zhao YL, Luan H, Wang M, Chua TS (2014) Memory recall based video search: finding videos you have seen before based on your memory. ACM Trans Multimed Comput Commun Appl 10(2):21
Zhou D, Bousquet O, Lal TN, Weston J, Schölkopf B (2004) Learning with local and global consistency. Adv Neural Inf Process Syst 16(16):321–328
Zhou D, Huang J, Schölkopf B (2005) Learning from labeled and unlabeled data on a directed graph. In: Proceedings of the 22nd international conference on Machine learning, pp 1036–1043
Zhou D, Huang J, Schölkopf B (2006) Learning with hypergraphs: Clustering, classification, and embedding. In: Advances in neural information processing systems, pp 1601–1608
Zien JY, Schlag MD, Chan PK (1999) Multilevel spectral hypergraph partitioning with arbitrary vertex sizes. Comput-Aided Des Integr Circuits Syst, IEEE Trans 18(9):1389–1399
Cite this article
Akbari, M., Nie, L. & Chua, TS. aMM: Towards adaptive ranking of multi-modal documents. Int J Multimed Info Retr 4, 233–245 (2015). https://doi.org/10.1007/s13735-015-0088-x
DOI: https://doi.org/10.1007/s13735-015-0088-x