Work in Progress

QA-FastPerson: Extending Video Platform Search Capabilities by Creating Summary Videos in Response to User Queries

Authors:
Kazuki Kawamura
Rekimoto Laboratory, The University of Tokyo, Japan and Kyoto Laboratory, Sony Computer Science Laboratories, Inc., Japan
ORCID: 0000-0002-5181-320X

Jun Rekimoto
Rekimoto Laboratory, The University of Tokyo, Japan and Kyoto Laboratory, Sony Computer Science Laboratories, Inc., Japan
ORCID: 0000-0002-3629-2514

AHs '24: Proceedings of the Augmented Humans International Conference 2024
April 2024
Pages 290–293
https://doi.org/10.1145/3652920.3653052

Published: 01 May 2024

ABSTRACT

In the rapidly evolving field of digital education, efficient and targeted access to information within video content has become critical. This study presents a system that extends the search capabilities of video platforms by generating summary videos that answer user queries. The system uses machine learning and natural language processing techniques to interpret complex user queries, pinpoint the exact video segment that provides the answer, and return a summary video built around that segment, so that users reach the answer more efficiently. Preliminary evaluations have demonstrated the system’s potential to accurately identify relevant content and generate effective summaries.
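
The query-to-summary flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the models used, so a bag-of-words cosine similarity stands in for the machine learning and natural language processing components. The `Segment` type, the function names, and the `context` parameter are all hypothetical. The sketch assumes the video already has a timestamped transcript, picks the transcript segment that best matches the query, and returns a time window around it that a summary video could cover.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical structure)."""
    start: float  # seconds from the beginning of the video
    end: float
    text: str


def tokenize(text):
    # Lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_answer_segment(query, segments):
    # Assumption: lexical overlap stands in for the paper's query understanding.
    q = Counter(tokenize(query))
    scores = [cosine(q, Counter(tokenize(s.text))) for s in segments]
    return max(range(len(segments)), key=scores.__getitem__)


def summary_window(segments, idx, context=1):
    # Extend the answer segment by `context` neighbours on each side.
    lo = max(0, idx - context)
    hi = min(len(segments) - 1, idx + context)
    return segments[lo].start, segments[hi].end


if __name__ == "__main__":
    transcript = [
        Segment(0.0, 10.0, "Welcome to the course on machine learning"),
        Segment(10.0, 25.0, "Gradient descent updates the weights using the gradient of the loss"),
        Segment(25.0, 40.0, "Backpropagation computes gradients layer by layer"),
        Segment(40.0, 55.0, "Next week we cover convolutional networks"),
    ]
    best = find_answer_segment("How does gradient descent update weights?", transcript)
    print(best, summary_window(transcript, best))
```

A real system would presumably replace the lexical scoring with a learned model and would summarize the selected window rather than return it unchanged. The sketch only shows the control flow of locating a segment and building a clip around it.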

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools

Published in
AHs '24: Proceedings of the Augmented Humans International Conference 2024
April 2024
355 pages
ISBN: 9798400709807
DOI: 10.1145/3652920
Editors:
Anusha Withana (The University of Sydney, AU)
Mark Billinghurst (University of South Australia, AU)
Karola Marky (Ruhr-University Bochum, DE)
Zhanna Sarsenbayeva (The University of Sydney, AU)
Don Samitha Elvitigala (Monash University, AU)
Benjamin Tag (Monash University, AU)
Steeven Villa (LMU Munich, DE)
Yun Suen Pai (University of Auckland, NZ)
Copyright © 2024 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Video summarization
e-learning
human–computer interaction
large language model
learning efficiency
user-centered design
Qualifiers
- Work in Progress
- Research
- Refereed limited