Abstract
Photos and videos are generated frequently in our daily activities. With such a large collection of visual data, it is essential to provide an efficient and easy-to-use system that lets users retrieve moments of interest for a wide variety of query types. This motivates us to develop and upgrade our solution for visual lifelog exploration. Our solution includes a web-based retrieval system and a set of best practices that help users work with it. This paper details the enhanced version of FIRST, our Flexible Interactive Retrieval SysTem for visual lifelog retrieval. Our system supports multiple modalities for interaction and query processing, including visual query by metadata, text query and visual information matching based on a joint embedding model, scene clustering based on visual and location information, flexible temporal event navigation, and query expansion with visual examples. Thanks to its flexible architecture, the system can easily integrate new modules to extend its functionality. We conduct a user study to analyze how novice and expert users of our system behave in different query scenarios. Finally, we propose best-practice guidelines that help users of our system efficiently find events or moments of interest.
Data availability statements
Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Acknowledgements
This research was funded by Vingroup and supported by Vingroup Innovation Foundation (VINIF) under project code VINIF.2019.DA19.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Nhat Hoang-Xuan and Hoang-Phuc Trang-Trung contributed equally to this work.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hoang-Xuan, N., Trang-Trung, HP., Tran, MK. et al. First-flexible interactive retrieval system for visual lifelog exploration. Multimed Tools Appl 82, 37877–37902 (2023). https://doi.org/10.1007/s11042-023-16287-9
DOI: https://doi.org/10.1007/s11042-023-16287-9