Automatic Check-Out via Prototype-Based Classifier Learning from Single-Product Exemplars

Chen, Hao; Wei, Xiu-Shen; Zhang, Faen; Shen, Yang; Xu, Hui; Xiao, Liang

doi:10.1007/978-3-031-19806-9_16

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13685)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Automatic Check-Out (ACO) aims to accurately predict the presence and count of each product category in check-out images. A major challenge is the significant domain gap between the training data (single-product exemplars) and the test data (check-out images). To mitigate this gap, we propose a method, termed PSP, that performs Prototype-based classifier learning from Single-Product exemplars. PSP first obtains a prototype representation of each product category from the single-product exemplars, exploiting the ability of prototypes to represent category semantics. Based on these prototypes, it then generates categorical classifiers together with a background classifier, which both recognize fine-grained product categories and separate background from products among the proposals derived from check-out images. To further improve ACO accuracy, we develop discriminative re-ranking, which adjusts the predicted scores of product proposals to make classifier learning more discriminative and yields a reasonable ordering of the scores that accounts for the fine-grained nature of the categories. Moreover, a multi-label recognition loss is added to model the co-occurrence of products in check-out images. Experiments are conducted on the large-scale RPC dataset. Our ACO result reaches 86.69%, an improvement of 6.18% over the state of the art, which demonstrates the superiority of PSP. Our code is available at https://github.com/Hao-Chen-NJUST/PSP.
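
The general idea of classifying proposals against per-category prototypes can be illustrated with a minimal sketch. It is not the PSP implementation: it assumes that a prototype is simply the mean of a category's exemplar features, that proposals are scored by cosine similarity, and that a fixed similarity threshold stands in for the learned background classifier. The function names, the threshold value, and the toy feature vectors are all hypothetical, and discriminative re-ranking and the multi-label loss are omitted.

```python
import math
from collections import Counter


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def build_prototypes(exemplars):
    """Map each category to the mean of its exemplar features (assumed prototype)."""
    return {cat: mean_vector(feats) for cat, feats in exemplars.items()}


def classify(feature, prototypes, background_threshold=0.8):
    """Return the most similar category, or 'background' below the threshold.

    The threshold is an assumption standing in for a learned background classifier.
    """
    best_cat, best_score = max(
        ((cat, cosine(feature, proto)) for cat, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    return best_cat if best_score >= background_threshold else "background"


def checkout_counts(proposal_features, prototypes, background_threshold=0.8):
    """Count the proposals assigned to each product category, ignoring background."""
    labels = (classify(f, prototypes, background_threshold) for f in proposal_features)
    return Counter(label for label in labels if label != "background")


# Toy single-product exemplar features (hypothetical 3-D vectors).
exemplars = {
    "cola": [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]],
    "chips": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}
prototypes = build_prototypes(exemplars)

# Toy proposal features from one check-out image; the last one matches no product.
proposals = [[0.95, 0.05, 0.0], [0.05, 1.0, 0.1], [1.0, 0.1, 0.05], [0.5, 0.5, 0.7]]
counts = checkout_counts(proposals, prototypes)
print(dict(counts))
```

In this toy setting the predicted shopping list is the per-category count of non-background proposals, which mirrors the ACO goal of predicting both the presence and the count of each product category.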

X.-S. Wei and Y. Shen are also with the Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and the Jiangsu Key Lab of Image and Video Understanding for Social Security, Nanjing University of Science and Technology, China. This work is supported by the National Key R&D Program of China (2021YFA1001100), the Natural Science Foundation of China under Grant (61871226), the Natural Science Foundation of Jiangsu Province of China under Grant (BK20210340), the Fundamental Research Funds for the Central Universities (No. 30920041111, No. NJ2022028), the CAAI-Huawei MindSpore Open Fund, the Beijing Academy of Artificial Intelligence (BAAI), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_0464).

Acknowledgments

The authors would like to thank the anonymous reviewers for their critical and constructive comments and suggestions. We gratefully acknowledge the support of MindSpore, CANN (Compute Architecture for Neural Networks) and Ascend AI Processor used for this research.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Hao Chen, Xiu-Shen Wei, Yang Shen & Liang Xiao
State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, China
Hao Chen & Xiu-Shen Wei
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
Xiu-Shen Wei
Qingdao AInnovation Technology Group Co., Ltd, Qingdao, China
Faen Zhang & Hui Xu

Corresponding authors

Correspondence to Xiu-Shen Wei or Liang Xiao.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

1 Electronic supplementary material

Supplementary material 1 (pdf 781 KB)

About this paper

Cite this paper

Chen, H., Wei, X.-S., Zhang, F., Shen, Y., Xu, H., Xiao, L. (2022). Automatic Check-Out via Prototype-Based Classifier Learning from Single-Product Exemplars. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_16

DOI: https://doi.org/10.1007/978-3-031-19806-9_16
Published: 20 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19805-2
Online ISBN: 978-3-031-19806-9
eBook Packages: Computer Science, Computer Science (R0)
