DOI: 10.1145/3477495.3536323
short-paper

An Intelligent Advertisement Short Video Production System via Multi-Modal Retrieval

Published: 07 July 2022

Abstract

In its most basic form, advertising video production communicates a message about a product or service to the public. In the age of digital marketing, advertising videos are the most popular way to connect with audiences. However, advertising video production is a costly and complicated process that runs from creation and material shooting through editing to the final commercial video. Producing high-quality advertising videos is therefore a capital- and talent-intensive task, which poses a huge challenge for start-ups and inexperienced ad creators. This paper proposes an intelligent advertising video production system driven by multi-modal retrieval, which requires only descriptive copy as input. The system automatically generates scripts, extracts key queries, retrieves related short video materials from the video library, and finally synthesizes short advertising videos. The whole process minimizes human input, greatly lowers the threshold for advertising video production, and greatly improves output and efficiency. The system has a modular design to encourage the study of new multi-modal algorithms, which can be evaluated in batch mode. It can also be integrated with a user interface, which allows user studies and data collection in an interactive mode, where the back end can be fully algorithmic or a Wizard of Oz setup. The proposed system has been fully verified and has broad prospects in the production of short videos for commodity advertisements within Alibaba.
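The four stages named in the abstract (script generation, key-query extraction, retrieval, synthesis) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not describe the models or interfaces, so `Clip`, `generate_script`, `extract_queries`, `retrieve`, `synthesize` and `produce_ad` are hypothetical names, script generation is reduced to sentence splitting, and multi-modal retrieval is replaced by bag-of-words cosine similarity over text tags attached to each clip.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical stand-ins throughout: the abstract names the stages only.

@dataclass(frozen=True)
class Clip:
    clip_id: str
    tags: str       # text tags standing in for the clip's visual content
    seconds: float  # clip duration

STOPWORDS = {"a", "an", "and", "in", "of", "the", "then", "to", "with"}

def tokenize(text):
    """Lowercase word tokens with common function words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def generate_script(copy):
    """Placeholder script generator: one script line per sentence of the copy."""
    return [s.strip() for s in re.split(r"[.!?]", copy) if s.strip()]

def extract_queries(script):
    """Placeholder key-query extractor: keep the content words of each line."""
    return [" ".join(tokenize(line)) for line in script]

def embed(text):
    """Toy embedding: a bag-of-words count vector (a real system would use
    learned text and video encoders that share one embedding space)."""
    return Counter(tokenize(text))

def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, library):
    """Return the library clip whose tags are most similar to the query."""
    q = embed(query)
    return max(library, key=lambda clip: cosine(q, embed(clip.tags)))

def synthesize(clips):
    """Lay the clips end to end and return (start_time, clip_id) pairs."""
    timeline, start = [], 0.0
    for clip in clips:
        timeline.append((start, clip.clip_id))
        start += clip.seconds
    return timeline

def produce_ad(copy, library):
    """Run the four stages in order: script, queries, retrieval, synthesis."""
    script = generate_script(copy)
    queries = extract_queries(script)
    clips = [retrieve(q, library) for q in queries]
    return synthesize(clips)

if __name__ == "__main__":
    library = [
        Clip("pour", "coffee pour cup morning", 2.0),
        Clip("run", "runner shoes road", 3.0),
        Clip("beach", "beach waves sunset", 2.5),
    ]
    copy = "Start the morning with fresh coffee. Then run farther in light shoes."
    print(produce_ad(copy, library))
```

Because each stage is a separate function, any one of them could be swapped for a different algorithm, or for a human operator in a Wizard of Oz study, without touching the others. This mirrors the modular design the abstract describes, although the actual module boundaries of the system are not given here.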

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN:9781450387323
DOI:10.1145/3477495
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cross-modal retrieval
  2. multi-modal retrieval
  3. neural networks
  4. video production

Qualifiers

  • Short-paper

Conference

SIGIR '22
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2024) Invisible Black-Box Backdoor Attack against Deep Cross-Modal Hashing Retrieval. ACM Transactions on Information Systems 42:4 (1-27). DOI: 10.1145/3650205. Online publication date: 26-Apr-2024
  • (2024) Unlocking Creator-AI Synergy: Challenges, Requirements, and Design Opportunities in AI-Powered Short-Form Video Production. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-23). DOI: 10.1145/3613904.3642476. Online publication date: 11-May-2024
  • (2024) SADCMF: Self-Attentive Deep Consistent Matrix Factorization for Micro-Video Multi-Label Classification. IEEE Transactions on Multimedia 26 (10331-10341). DOI: 10.1109/TMM.2024.3406196. Online publication date: 2024
  • (2024) A Survey of Generative AI: A Game Changer for Free Streaming Services and Ad Personalization with Current Techniques, Identifying Research Gaps and Addressing Challenges. 2024 4th International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME) (1-7). DOI: 10.1109/ICECCME62383.2024.10796059. Online publication date: 4-Nov-2024
