ABSTRACT
AI-assisted tools have become more prevalent than ever in the last few years. However, applying them to build a lifelog retrieval system remains non-trivial because of the disparity in interfaces and interactions. The Lifelog Search Challenge (LSC) aims to provide a testing ground where systems can be benchmarked in a highly competitive setting. In this paper, we present the fourth iteration of our participating system, FIRST. This year, we adopt generative models to give the system predictive ability, rather than relying entirely on the user to input the query. We also index each sequence of images as a single event to improve search speed. Finally, we demonstrate how these additional features can assist users in searching.
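The abstract does not say how events are formed or indexed, so the following is only a minimal sketch of the general idea, under stated assumptions: consecutive images are grouped into an event when their embeddings are similar (cosine similarity above a hypothetical `threshold`), and each event is represented by the mean of its members' embeddings. The function names (`group_into_events`, `event_vectors`, `search`) and the toy two-dimensional vectors are illustrative and are not taken from FIRST.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_into_events(embeddings, threshold=0.8):
    """Split a time-ordered list of image embeddings into events.

    Assumption: an image joins the current event when it is similar enough
    to the image immediately before it; otherwise it starts a new event.
    Returns a list of events, each a list of image indices.
    """
    events = []
    for i, emb in enumerate(embeddings):
        if events and cosine(embeddings[i - 1], emb) >= threshold:
            events[-1].append(i)
        else:
            events.append([i])
    return events


def event_vectors(embeddings, events):
    """Represent each event by the mean embedding of its images (an assumption)."""
    vectors = []
    for members in events:
        dim = len(embeddings[members[0]])
        vectors.append(
            [sum(embeddings[i][d] for i in members) / len(members) for d in range(dim)]
        )
    return vectors


def search(query, embeddings, events, top_k=1):
    """Rank events (not individual images) by similarity to the query vector."""
    vectors = event_vectors(embeddings, events)
    ranked = sorted(
        range(len(events)),
        key=lambda e: cosine(query, vectors[e]),
        reverse=True,
    )
    return [events[e] for e in ranked[:top_k]]


# Toy data: four time-ordered "image embeddings" forming two visually distinct runs.
images = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
events = group_into_events(images)
best = search([0.0, 1.0], images, events)
```

In this toy case the query is compared against two event vectors instead of four image vectors, which is one way an event-level index could reduce search work; the actual grouping criterion and index structure used by the system may differ.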
Index Terms
- Lifelog Discovery Assistant: Suggesting Prompts and Indexing Event Sequences for FIRST at LSC 2023