Multi-task cascade deep convolutional neural networks for large-scale commodity recognition

Zou, Xiaofeng; Zhou, Liqian; Li, Kenli; Ouyang, Aijia; Chen, Cen

doi:10.1007/s00521-019-04311-9

Advances in Parallel and Distributed Computing for Neural Computing
Published: 01 July 2019

Volume 32, pages 5633–5647, (2020)
Neural Computing and Applications

Xiaofeng Zou ORCID: orcid.org/0000-0002-5823-6345¹,
Liqian Zhou²,
Kenli Li¹,
Aijia Ouyang¹ &
Cen Chen¹,³

867 Accesses
19 Citations

Abstract

In recent years, deep convolutional neural networks (CNNs) have achieved remarkable performance in object detection and image classification. However, large-scale image recognition still poses practical challenges. Specifically, the visual separability between object categories is highly uneven, and some categories exhibit strong inter-class similarity. Existing CNNs are trained as flat n-way classifiers, which is usually insufficient to meet these challenges. We therefore propose a multi-task cascade deep convolutional neural network (MTCD-CNN) framework for large-scale commodity recognition, which consists of two phases: object detection and hierarchical image classification. First, the object detection framework locates and crops the regions that may contain objects. Then, hierarchical spectral clustering is used to construct a category tree and a tree-like image classification model. During testing, hard-to-distinguish objects are classified from coarse to fine by searching a path through the category tree. The proposed hierarchical image classification method also provides insight into the data by identifying the groups of classes that are hard to classify and require more attention than others. In extensive experiments and comparative analyses of commodity detection in supermarkets and stores in Jinzhou city, MTCD-CNN outperformed other advanced methods, indicating that the proposed method effectively addresses the problem of excessive similarity between easily confused categories.
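The grouping and coarse-to-fine steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which affinity measure or clustering variant MTCD-CNN uses, so the code below assumes a class confusion matrix as the affinity and substitutes a single two-way split on the sign of the Fiedler vector of the unnormalized graph Laplacian for the hierarchical spectral clustering. The function names, the toy confusion matrix, and the score values are all hypothetical. Applying the split recursively to each group would yield a deeper category tree.

```python
def spectral_bipartition(confusion, iters=200):
    """Split class indices into two groups of mutually confused classes.

    Assumption: confusion[i][j] counts samples of class i predicted as j,
    and the symmetrized off-diagonal counts serve as the affinity.
    """
    n = len(confusion)
    if n < 2:
        return [list(range(n)), []]
    # Symmetric affinity W with a zero diagonal.
    w = [[0.0 if i == j else (confusion[i][j] + confusion[j][i]) / 2.0
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in w]
    # Gershgorin bound: every eigenvalue of L = D - W is at most 2 * max degree.
    shift = 2.0 * max(deg) + 1e-9
    # Power iteration on (shift*I - L) with the constant vector projected out
    # converges to the Fiedler vector (second-smallest eigenvector of L).
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]
        nxt = [(shift - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
               for i in range(n)]
        norm = sum(x * x for x in nxt) ** 0.5
        v = [x / norm for x in nxt]
    # The sign of each entry decides the group of the corresponding class.
    left = [i for i in range(n) if v[i] < 0]
    right = [i for i in range(n) if v[i] >= 0]
    return [left, right]


def classify_coarse_to_fine(coarse_scores, fine_scores):
    """Pick the best coarse group, then the best class inside that group.

    coarse_scores: one score per group (hypothetical coarse classifier output).
    fine_scores: one dict per group mapping class name -> score
                 (hypothetical per-group fine classifier output).
    """
    group = max(range(len(coarse_scores)), key=lambda g: coarse_scores[g])
    label = max(fine_scores[group], key=fine_scores[group].get)
    return group, label


# Toy data: classes 0/1 (two canned drinks) and 2/3 (two hair products)
# are confused mostly within their own pair.
CONFUSION = [
    [80, 18, 1, 1],
    [15, 83, 1, 1],
    [1, 1, 78, 20],
    [2, 0, 22, 76],
]

if __name__ == "__main__":
    print(spectral_bipartition(CONFUSION))
    print(classify_coarse_to_fine(
        [0.2, 0.8],
        [{"cola": 0.6, "soda": 0.4}, {"shampoo": 0.3, "conditioner": 0.7}],
    ))
```

The sign-based split is the simplest spectral partition; a k-way variant would instead cluster the rows of the first k Laplacian eigenvectors, as covered in the spectral clustering tutorial the article cites.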

Acknowledgements

The research was partially funded by the National Key R&D Program of China (Grant No. 2018YFB1003401), the National Outstanding Youth Science Program of National Natural Science Foundation of China (Grant No. 61625202), the International (Regional) Cooperation and Exchange Program of National Natural Science Foundation of China (Grant Nos. 61661146006, 61860206011), the National Natural Science Foundation of China (Grant No. 61602350), the Singapore–China NRF-NSFC Grant (Grant No. NRF2016NRF-NSFC001-111), and the Open Foundation of Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (No. znxx2018MS01).

Author information

Authors and Affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
Xiaofeng Zou, Kenli Li, Aijia Ouyang & Cen Chen
School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
Liqian Zhou
Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Cen Chen

Corresponding author

Correspondence to Kenli Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zou, X., Zhou, L., Li, K. et al. Multi-task cascade deep convolutional neural networks for large-scale commodity recognition. Neural Comput & Applic 32, 5633–5647 (2020). https://doi.org/10.1007/s00521-019-04311-9

Received: 08 January 2019
Accepted: 17 June 2019
Published: 01 July 2019
Issue Date: May 2020
DOI: https://doi.org/10.1007/s00521-019-04311-9
