
Up-Fusion: An Evolving Multimedia Fusion Method

Published: 04 September 2014

Abstract

The amount of multimedia data on the Internet has grown exponentially over the past few decades, and this trend is likely to continue. Multimedia content inherently draws on multiple information sources, so effective fusion methods are critical for data analysis and understanding. Most existing fusion methods are static with respect to time, which makes it difficult for them to handle evolving multimedia content. Several evolving fusion methods have been proposed in recent years to address this issue; however, their requirements are difficult to meet, which limits them to a narrow range of applications. In this article, we propose a novel evolving fusion method based on online portfolio selection theory. The proposed method takes into account the correlation among different information sources and evolves the fusion model as new multimedia data is added. It performs effectively on both crisp and soft decisions without requiring additional context information. Extensive experiments on concept detection and human detection tasks, conducted over the TRECVID dataset and surveillance data, show significantly better performance.
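
To make the idea of an evolving, portfolio-style fusion model more concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes each information source behaves like an asset in an online portfolio and updates the fusion weights with a generic exponentiated-gradient (multiplicative) step each time a newly labeled sample arrives. The payoff definition, the learning rate `eta`, and the function names `fuse`, `eg_update`, and `run_stream` are all hypothetical choices made for illustration. Unlike the method described above, this sketch ignores the correlation among sources.

```python
import math


def fuse(weights, scores):
    """Weighted-sum fusion of per-source scores, each assumed to lie in [0, 1]."""
    return sum(w * s for w, s in zip(weights, scores))


def eg_update(weights, scores, label, eta=0.5):
    """One exponentiated-gradient step that treats each source as an 'asset'.

    The payoff is a hypothetical stand-in for a price relative: it is larger
    when a source's score is closer to the true label (0 or 1).
    """
    # Payoffs stay in [0.5, 1.5], so the ratio below is bounded.
    payoffs = [1.5 - abs(label - s) for s in scores]
    portfolio_payoff = sum(w * p for w, p in zip(weights, payoffs))
    unnormalized = [
        w * math.exp(eta * p / portfolio_payoff)
        for w, p in zip(weights, payoffs)
    ]
    total = sum(unnormalized)
    # Renormalize so the weights remain on the probability simplex.
    return [u / total for u in unnormalized]


def run_stream(stream, n_sources, eta=0.5):
    """Start from uniform weights and update them as labeled samples arrive."""
    weights = [1.0 / n_sources] * n_sources
    for scores, label in stream:
        weights = eg_update(weights, scores, label, eta)
    return weights


if __name__ == "__main__":
    # Made-up scores: source 0 tracks the label, source 1 does not.
    stream = [([0.9, 0.2], 1), ([0.1, 0.7], 0), ([0.8, 0.4], 1)]
    weights = run_stream(stream, n_sources=2)
    print(weights, fuse(weights, [0.7, 0.3]))
```

Because the scores are real-valued, the same update applies to soft decisions directly, and to crisp decisions by passing scores of exactly 0 or 1. Extending the sketch to account for correlation among sources would require a different update rule, which is not attempted here.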


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 11, Issue 1
August 2014
151 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/2665935

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 September 2014
Accepted: 01 April 2014
Revised: 01 March 2014
Received: 01 October 2013
Published in TOMM Volume 11, Issue 1

Author Tags

  1. Up-fusion
  2. fusion
  3. portfolio theory

Qualifiers

  • Research-article
  • Research
  • Refereed
