Research Article

Instance-Based Continual Learning: A Real-World Dataset and Baseline for Fresh Recognition

Published: 24 August 2023

Abstract

Real-time learning on real-world data streams with temporal relations is essential for intelligent agents. However, current online Continual Learning (CL) benchmarks adopt the mini-batch setting and are composed of temporally unrelated, disjoint tasks with pre-set class boundaries. In this paper, we delve into a real-world CL scenario for fresh recognition, where algorithms are required to recognize a huge variety of products to accelerate checkout. The products mainly consist of packaged cereals, seasonal fruits, and vegetables from local farms or shipped from overseas. Since algorithms process instance streams consisting of sequential images, we name this real-world CL problem Instance-Based Continual Learning (IBCL). Different from the current online CL setting, algorithms are required to perform instant testing and learning upon each incoming instance. Moreover, IBCL has no task or class boundaries and allows the evolution and forgetting of old samples within each class. To promote research on real CL challenges, we propose the first real-world CL dataset, coined the Continual Fresh Recognition (CFR) dataset, which consists of fresh recognition data streams (766K labelled images in total) collected from 30 supermarkets. Based on the CFR dataset, we extensively evaluate the performance of current online CL methods under various settings and find that prominent online CL methods operate at high latency and demand significant memory to cache old samples for replay. Therefore, we make the first attempt to design an efficient and effective Instant Training-Free Learning (ITFL) framework for IBCL. ITFL consists of feature extractors trained in the metric learning manner and reformulates CL as a temporal classification problem among the most similar classes. Unlike current online CL methods that cache image samples (150 KB per image) and rely on training to learn new knowledge, our framework caches only features (2 KB per image) and is free of training at deployment. Extensive evaluations across three datasets demonstrate that our method achieves recognition accuracy comparable to current methods with lower latency and less resource consumption. Our code and datasets will be publicly available at https://github.com/detectRecog/IBCL.
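To make the ITFL idea in the abstract concrete, the sketch below (Python) illustrates an instant training-free learner: a frozen, metric-learned feature extractor plus a bounded per-class feature cache. Each incoming instance is first classified by similarity against cached class features, restricted to the top-k most similar classes (where the paper instead applies a temporal classifier over these candidates), and its own feature is then cached, with the oldest entries evicted, mirroring the forgetting of old samples within a class. This is a minimal hypothetical sketch, not the authors' implementation; all names (`ITFLSketch`, `cache_per_class`, `top_k`) are illustrative assumptions.

```python
# A minimal sketch (NOT the authors' implementation) of the Instant
# Training-Free Learning (ITFL) idea: frozen feature extractor + feature cache.
# No gradient updates occur at deployment; learning is pure caching.
from collections import defaultdict, deque
import numpy as np

class ITFLSketch:
    def __init__(self, extractor, cache_per_class=50, top_k=5):
        self.extractor = extractor  # frozen, metric-learned: image -> 1D np.ndarray
        # label -> recent features; deque(maxlen=...) evicts the oldest sample,
        # modelling within-class forgetting (hypothetical design choice).
        self.cache = defaultdict(lambda: deque(maxlen=cache_per_class))
        self.top_k = top_k          # number of most similar classes to re-rank

    def _embed(self, image):
        f = self.extractor(image)
        return f / (np.linalg.norm(f) + 1e-12)  # unit norm => cosine similarity

    def predict(self, image):
        """Instant test: rank classes by similarity to their cached features."""
        if not self.cache:
            return None
        f = self._embed(image)
        # Score each class by mean cosine similarity of its cached features.
        scores = {label: float(np.mean(np.stack(feats) @ f))
                  for label, feats in self.cache.items()}
        # Shortlist the top-k most similar classes; the paper runs a temporal
        # classifier over this shortlist, here we simply pick the best score.
        shortlist = sorted(scores, key=scores.get, reverse=True)[: self.top_k]
        return max(shortlist, key=scores.get)

    def learn(self, image, label):
        """Instant training-free update: cache the feature for this instance."""
        self.cache[label].append(self._embed(image))
```

As a rough consistency check on the abstract's numbers: a 512-dimensional float32 feature occupies about 2 KB, versus roughly 150 KB for a cached raw image, which is where the memory saving of feature-level replay comes from.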



Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 1 (January 2024), 639 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3613542
Editor: Abdulmotaleb El Saddik


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 18 October 2022
• Revised: 19 March 2023
• Accepted: 26 March 2023
• Online AM: 25 April 2023
• Published: 24 August 2023
