Abstract
Correlation measures have recently attracted attention in multimedia retrieval as an alternative to distance metrics such as the Euclidean and Mahalanobis distances. However, most correlation learning algorithms focus on multimedia data of a single modality; for heterogeneous multi-modal data, correlation learning is more complicated. In this paper, we analyze multi-modal correlation among text, image, and audio to understand the underlying semantics for multi-modal retrieval. First, Kernel Canonical Correlation is used to build a kernel space in which global inter-media correlation is analyzed. Second, based on the local geometrical topology in the kernel space, a weighted graph and its corresponding affinity matrix are formed to represent the data and their correlations. Third, correlation ranking is used to generate retrieval results. We also provide active learning strategies in relevance feedback to improve retrieval results. Experimental and comparison results are encouraging and indicate that our approach is effective.
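The graph construction and ranking steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the sketch assumes the standard manifold-ranking iteration f ← αSf + (1 − α)y with S = D^(-1/2) W D^(-1/2), a symmetric k-nearest-neighbour graph, and Gaussian edge weights. It also assumes that all media objects have already been mapped into a shared space (the kernel-space step is not shown). The function names, the parameters k, sigma, and alpha, and the toy coordinates are hypothetical.

```python
import math


def affinity_matrix(points, k=2, sigma=1.0):
    """Build a symmetric kNN graph with Gaussian edge weights.

    Assumption: `points` are coordinates in a shared space in which
    objects of different modalities are directly comparable.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # k nearest neighbours of i, excluding i itself
        neighbours = sorted(
            (j for j in range(n) if j != i), key=lambda j: dist[i][j]
        )[:k]
        for j in neighbours:
            w = math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2))
            W[i][j] = W[j][i] = w  # keep the graph undirected
    return W


def rank_scores(W, query_index, alpha=0.9, iterations=200):
    """Propagate the query's score over the graph (manifold ranking).

    Iterates f <- alpha * S f + (1 - alpha) * y, where
    S = D^(-1/2) W D^(-1/2) and y marks the query object.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    S = [
        [
            W[i][j] / math.sqrt(degree[i] * degree[j]) if W[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
            for i in range(n)
        ]
    return f


# Toy data (hypothetical): two well-separated groups of three objects each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
W = affinity_matrix(points, k=2)
scores = rank_scores(W, query_index=0)
ranking = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
```

In this toy setting the two groups form disconnected graph components, so only objects in the query's group receive a nonzero score. Relevance feedback could be folded in by setting additional entries of y for objects the user marks as relevant, though the abstract does not state how the active learning strategies select them.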
This work is supported by the Scientific Research Project funded by the Education Department of Hubei Province (Q20091101) and the Science Foundation of Wuhan University of Science and Technology (2008TD04).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, H., Meng, F. (2009). Multi-modal Correlation Modeling and Ranking for Retrieval. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_56
DOI: https://doi.org/10.1007/978-3-642-10467-1_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science (R0)