short-paper

An Intelligent Advertisement Short Video Production System via Multi-Modal Retrieval

Authors: Lianghua Huang, Pan Pan

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 3368 - 3372

https://doi.org/10.1145/3477495.3536323

Published: 07 July 2022

Abstract

In its most basic form, an advertising video communicates a message about a product or service to the public, and in the age of digital marketing, advertising videos are the most popular way to connect with audiences. However, advertising video production is a costly and complicated process, running from creative planning and material shooting through editing to the final commercial video. Producing qualified advertising videos is therefore a capital- and talent-intensive task, which poses a huge challenge for start-ups and inexperienced ad creators. This paper proposes an intelligent advertising video production system driven by multi-modal retrieval, which requires only descriptive copy as input. The system automatically generates scripts, extracts key queries, retrieves related short video materials from the video library, and finally synthesizes short advertising videos. The whole process minimizes human input, greatly lowers the threshold for advertising video production, and greatly improves output and efficiency. The system has a modular design to encourage the study of new multi-modal algorithms, which can be evaluated in batch mode. It can also be integrated with a user interface, which allows user studies and data collection in an interactive mode, where the back end can be fully algorithmic or a Wizard of Oz setup. The proposed system has been fully verified and has broad prospects in the production of short videos for commodity advertisements within Alibaba.
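
The four stages named in the abstract (script generation, key query extraction, retrieval from a video library, synthesis) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the models used, so the sketch substitutes a toy bag-of-words cosine similarity over text captions for the multi-modal retrieval model, and every name in it (Clip, generate_script, extract_query, retrieve, synthesize, LIBRARY) is hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Small stopword list for the toy tokenizer (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "and", "or", "for", "with", "of", "to",
             "in", "on", "is", "it", "your", "this"}


@dataclass
class Clip:
    clip_id: str
    caption: str    # hypothetical text tag standing in for visual features
    seconds: float  # clip duration


def tokenize(text):
    """Lowercase the text and keep alphabetic content words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def generate_script(copy):
    """Stage 1 (toy): one script line per sentence of the descriptive copy."""
    return [s.strip() for s in re.split(r"[.!?]+", copy) if s.strip()]


def extract_query(line):
    """Stage 2 (toy): the key query is the bag of content words in the line."""
    return Counter(tokenize(line))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, library):
    """Stage 3 (toy): pick the clip whose caption best matches the query."""
    return max(library, key=lambda c: cosine(query, Counter(tokenize(c.caption))))


def synthesize(copy, library):
    """Stage 4 (toy): build a timeline of (start_time, clip_id, script_line)."""
    timeline, start = [], 0.0
    for line in generate_script(copy):
        clip = retrieve(extract_query(line), library)
        timeline.append((start, clip.clip_id, line))
        start += clip.seconds
    return timeline


# Hypothetical three-clip library and input copy, for illustration only.
LIBRARY = [
    Clip("c1", "runner tying lightweight shoes", 3.0),
    Clip("c2", "person running on a mountain trail", 4.0),
    Clip("c3", "coffee being poured into a cup", 2.5),
]

timeline = synthesize(
    "Lightweight shoes for every runner. Built for the mountain trail.", LIBRARY
)
for start, clip_id, line in timeline:
    print(f"{start:4.1f}s  {clip_id}  {line}")
```

Because each stage is a separate function, any one of them could be swapped for a learned component (for example, replacing retrieve with a text-to-video embedding search) without touching the others, which is one plausible reading of the modular design the abstract describes.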

Index Terms

An Intelligent Advertisement Short Video Production System via Multi-Modal Retrieval
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Novelty in information retrieval
  2. Information systems applications
    1. Computational advertising
    2. Multimedia information systems
      1. Multimedia content creation

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2022

3569 pages

ISBN:9781450387323

DOI:10.1145/3477495

General Chairs: Enrique Amigo (UNED), Pablo Castells (UAM and Amazon), Julio Gonzalo (UNED)

Program Chairs: Ben Carterette (Spotify), J. Shane Culpepper (RMIT University), Gabriella Kazai (Waseda University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 11 - 15, 2022

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 4
Total Downloads: 306
Downloads (Last 12 months): 75
Downloads (Last 6 weeks): 12

Reflects downloads up to 28 Feb 2025

Cited By

Wang T, Li F, Zhu L, Li J, Zhang Z, Shen H. (2024). Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing Retrieval. ACM Transactions on Information Systems 42:4 (1-27). Online publication date: 26-Apr-2024.
https://dl.acm.org/doi/10.1145/3650205
Kim J, Kim H. (2024). Unlocking Creator-AI Synergy: Challenges, Requirements, and Design Opportunities in AI-Powered Short-Form Video Production. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-23). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613904.3642476
Fan F, Jing P, Nie L, Gu H, Su Y. (2024). SADCMF: Self-Attentive Deep Consistent Matrix Factorization for Micro-Video Multi-Label Classification. IEEE Transactions on Multimedia 26 (10331-10341). Online publication date: 2024.
https://doi.org/10.1109/TMM.2024.3406196
Ramagundam S, Karne N. (2024). A Survey of Generative AI: A Game Changer for Free Streaming Services and Ad Personalization with Current Techniques, Identifying Research Gaps and Addressing Challenges. 2024 4th International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME) (1-7). Online publication date: 4-Nov-2024.
https://doi.org/10.1109/ICECCME62383.2024.10796059
