The Combination of Background Subtraction and Convolutional Neural Network for Product Recognition

  • Conference paper
  • Intelligent Information and Database Systems (ACIIDS 2022)

Abstract

Multi-class retail product recognition is an important computer vision application for the retail industry. Track 4 of the AI City Challenge addresses this domain, focusing on the accuracy and efficiency of the automatic checkout process. Because real-world training data for retail items is scarce, a synthetic dataset is typically generated from 3D-scanned items to train an automated checkout system. To overcome the gap in visual appearance between the synthetic training data and the real-world test set provided by the AI City Challenge organizers, our research analyzes and recognizes retail items by combining a traditional background subtraction method with a state-of-the-art Convolutional Neural Network (CNN). This paper presents our proposed system for product counting and recognition in automated retail checkout. Our proposed method ranked in the top eight of the experimental evaluation of the 2022 AI City Challenge Track 4 with an F1-score of 0.4082.



Acknowledgment

We would like to express our sincere thanks to Ho Chi Minh City International University - Vietnam National University (HCMIU-VNU) for supporting our work. We would also like to express our appreciation to all of our colleagues for their contributions, which considerably aided in the revision of this manuscript.

Author information


Corresponding author

Correspondence to Synh Viet-Uyen Ha.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Thai, T.T., et al. (2022). The Combination of Background Subtraction and Convolutional Neural Network for Product Recognition. In: Nguyen, N.T., Tran, T.K., Tukayev, U., Hong, T.P., Trawiński, B., Szczerbicki, E. (eds) Intelligent Information and Database Systems. ACIIDS 2022. Lecture Notes in Computer Science, vol 13757. Springer, Cham. https://doi.org/10.1007/978-3-031-21743-2_22

  • DOI: https://doi.org/10.1007/978-3-031-21743-2_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21742-5

  • Online ISBN: 978-3-031-21743-2

  • eBook Packages: Computer Science, Computer Science (R0)
