DOI: 10.1145/3592573.3593104
research-article

Lifelog Discovery Assistant: Suggesting Prompts and Indexing Event Sequences for FIRST at LSC 2023

Published: 12 June 2023

ABSTRACT

AI-assisted tools have become more prevalent than ever in the last few years. However, applying them to build a lifelog retrieval system is still non-trivial due to the disparity in interfaces and interactions. The Lifelog Search Challenge (LSC) aims to provide a testing ground where systems can be benchmarked in a highly competitive setting. In this paper, we present the fourth iteration of our participating system, FIRST. This year, we adopt generative models to equip the system with predictive ability rather than relying entirely on the user to input the query. We also index a sequence of images as an event for improved search speed. Finally, we demonstrate how the additional features can assist users in searching.
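The abstract does not say how image sequences are turned into events, so the following is only an illustrative sketch of one way event-level indexing could reduce the number of items searched. It assumes each image already has an embedding vector, groups consecutive images whose cosine similarity stays above a threshold, and stores one mean embedding per event. The names `Event`, `build_events`, and `search`, and the threshold of 0.8, are hypothetical and are not taken from FIRST.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Event:
    start: int        # index of the first image in the event
    end: int          # index of the last image in the event (inclusive)
    embedding: list   # mean of the member image embeddings


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_events(embeddings, threshold=0.8):
    """Group consecutive images into events.

    A new event starts whenever the similarity between an image and its
    predecessor drops below `threshold` (an assumed, illustrative rule).
    """
    events = []
    start = 0
    for i in range(1, len(embeddings) + 1):
        at_end = i == len(embeddings)
        if at_end or cosine(embeddings[i - 1], embeddings[i]) < threshold:
            events.append(Event(start, i - 1, mean(embeddings[start:i])))
            start = i
    return events


def search(events, query, k=1):
    """Return the k events whose mean embedding is most similar to the query."""
    ranked = sorted(events, key=lambda e: cosine(e.embedding, query), reverse=True)
    return ranked[:k]
```

Under these assumptions, a query is compared against one vector per event rather than one per image, which is the kind of saving the abstract alludes to; the matching event's `start` and `end` indices can then be used to retrieve its images.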

