research-article

Incremental Multimodal Query Construction for Video Search

Authors:
Shicheng Xu (Carnegie Mellon University, Pittsburgh, PA, USA)
Huan Li (Carnegie Mellon University, Pittsburgh, PA, USA)
Xiaojun Chang (University of Queensland, Brisbane, Australia)
Shoou-I Yu (Carnegie Mellon University, Pittsburgh, PA, USA)
Xingzhong Du (University of Queensland, Brisbane, Australia)
Xuanchong Li (Carnegie Mellon University, Pittsburgh, PA, USA)
Lu Jiang (Carnegie Mellon University, Pittsburgh, PA, USA)
Zexi Mao (Carnegie Mellon University, Pittsburgh, PA, USA)
Zhenzhong Lan (Carnegie Mellon University, Pittsburgh, PA, USA)
Susanne Burger (Carnegie Mellon University, Pittsburgh, PA, USA)
Alexander Hauptmann (Carnegie Mellon University, Pittsburgh, PA, USA)

ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, June 2015, Pages 675–678, https://doi.org/10.1145/2671188.2749413

Published: 22 June 2015

ABSTRACT

Recent improvements in content-based video search have produced systems with promising accuracy, opening up the possibility of interactive content-based video search for the general public. We present an interactive system, built on a state-of-the-art content-based video search pipeline, that lets users perform multimodal text-to-video and video-to-video search over large video collections and incrementally refine queries through relevance feedback and model visualization. Its comprehensive functionality also supports flexible formulation of multimodal queries with different characteristics. Quantitative and qualitative analyses show that the system helps users incrementally build effective queries for complex event topics.
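
As an illustration of what incremental query refinement through relevance feedback can look like, the sketch below uses the classic Rocchio update. This is a stand-in technique chosen for brevity, not the method described in the paper, which the abstract does not specify. The function names, the weights `alpha`, `beta`, `gamma`, and the three-dimensional "concept score" vectors are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Vector], dim: int) -> Vector:
    """Component-wise mean; a zero vector when no examples are given."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query: Vector,
                   relevant: Sequence[Vector],
                   irrelevant: Sequence[Vector],
                   alpha: float = 1.0,
                   beta: float = 0.75,
                   gamma: float = 0.15) -> Vector:
    """Move the query toward videos marked relevant and away from
    videos marked irrelevant (Rocchio update; weights are assumptions)."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(irrelevant, dim)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


def rank(query: Vector, videos: Dict[str, Vector]) -> List[str]:
    """Rank video ids by dot-product score with the query, best first."""
    def score(features: Vector) -> float:
        return sum(q * x for q, x in zip(query, features))
    return sorted(videos, key=lambda vid: score(videos[vid]), reverse=True)


if __name__ == "__main__":
    # Hypothetical per-video concept scores: [dog, beach, birthday_cake].
    videos = {
        "v1": [0.9, 0.1, 0.0],
        "v2": [0.2, 0.8, 0.1],
        "v3": [0.1, 0.1, 0.9],
    }
    query = [1.0, 0.0, 0.0]          # initial query: "dog" only
    print(rank(query, videos))

    # One feedback round: the user marks v3 relevant and v1 irrelevant.
    refined = rocchio_update(query, [videos["v3"]], [videos["v1"]])
    print(rank(refined, videos))
```

Each feedback round yields a new query vector, so the same update can be applied repeatedly as the user labels more results, which is one simple way to read "incremental" query construction.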

Index Terms

1. Information systems
  1. Information retrieval

Published in
ICMR '15: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval
June 2015
700 pages
ISBN: 9781450332743
DOI: 10.1145/2671188
General Chairs: Alex Hauptmann (Carnegie Mellon University, USA), Chong-Wah Ngo (City University of Hong Kong, China), Xiangyang Xue (Fudan University, China)
Program Chairs: Yu-Gang Jiang (Fudan University, China), Cees Snoek (University of Amsterdam and Qualcomm Research Netherlands), Nuno Vasconcelos (University of California, San Diego, USA)
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Best Demo
Author Tags
content-based video search
multimedia event detection
multimodal query generation
Acceptance Rates
ICMR '15 Paper Acceptance Rate: 48 of 127 submissions, 38%. Overall Acceptance Rate: 254 of 830 submissions, 31%.