DOI: 10.1145/1101149.1101337
Article

Graph based multi-modality learning

Published: 06 November 2005

ABSTRACT

To better understand multimedia content, considerable research effort has been devoted to learning from multi-modal features. This paper studies the problem from a graph point of view: the features from each modality are represented as one independent graph, and the learning task is formulated as inference from the constraints in every graph together with supervision information (if available). For semi-supervised learning, two fusion schemes are proposed, namely a linear form and a sequential form. Each scheme is derived from an optimization point of view and further justified from two sides: similarity propagation and Bayesian interpretation. In doing so, we reveal the regular optimization nature, the transductive learning nature, and the prior fusion nature of the proposed schemes, respectively. Moreover, the proposed method can be easily extended to unsupervised learning, including clustering and embedding. Systematic experimental results validate the effectiveness of the proposed method.
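To make the two fusion schemes concrete, the following is a minimal sketch and not the paper's exact formulation, which the abstract does not spell out. It assumes a standard closed-form label propagation of the form F = (I - alpha S)^{-1} Y on a symmetrically normalized affinity graph S. Under that assumption, the linear form is read as propagating over a weighted sum of the per-modality graphs, and the sequential form as feeding the output of one graph's propagation into the next. The function names, the Gaussian affinity, the parameters `alpha` and `sigma`, and the toy data are all hypothetical.

```python
import numpy as np


def normalized_affinity(X, sigma=1.0):
    """Return S = D^{-1/2} W D^{-1/2} for a Gaussian affinity graph on the rows of X."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def propagate(S, Y, alpha=0.9):
    """Assumed propagation step: solve (I - alpha * S) F = Y for F."""
    n = S.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


def linear_fusion(graphs, weights, Y, alpha=0.9):
    """Assumed linear form: propagate once over a convex combination of the graphs."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    S = sum(w * S_k for w, S_k in zip(weights, graphs))
    return propagate(S, Y, alpha)


def sequential_fusion(graphs, Y, alpha=0.9):
    """Assumed sequential form: the output on one graph is the input to the next."""
    F = Y
    for S_k in graphs:
        F = propagate(S_k, F, alpha)
    return F


# Hypothetical toy data: six items, two modalities, two well-separated groups.
X_a = np.array([[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]])    # modality A features
X_b = np.array([[1.0], [1.1], [0.9], [-3.0], [-3.2], [-2.9]])  # modality B features

# One labeled item per class; all other rows are unlabeled (all zeros).
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

graphs = [normalized_affinity(X_a), normalized_affinity(X_b)]
pred_linear = linear_fusion(graphs, [0.5, 0.5], Y).argmax(axis=1)
pred_sequential = sequential_fusion(graphs, Y).argmax(axis=1)
print(pred_linear, pred_sequential)
```

In this sketch each modality contributes only through its own graph, so adding a modality means adding one more entry to `graphs`. The fusion weights and the order of the sequential pass are free choices here, not values taken from the paper.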

    • Published in

      MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia
      November 2005
      1110 pages
      ISBN:1595930442
      DOI:10.1145/1101149

      Copyright © 2005 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      MULTIMEDIA '05 paper acceptance rate: 49 of 312 submissions, 16%
      Overall acceptance rate: 995 of 4,171 submissions, 24%
