Multi-task cascade deep convolutional neural networks for large-scale commodity recognition

  • Advances in Parallel and Distributed Computing for Neural Computing
  • Neural Computing and Applications

Abstract

In recent years, deep convolutional neural networks (CNNs) have achieved remarkable performance in object detection and image classification. However, large-scale image recognition still poses practical challenges. Specifically, the visual separability between object categories is highly uneven, and some categories are strongly similar to one another. Existing CNNs are trained as flat n-way classifiers, which is usually insufficient to meet these challenges. We therefore propose the multi-task cascade deep convolutional neural network (MTCD-CNN), a framework for large-scale commodity recognition with two phases: object detection and hierarchical image classification. First, an object detection framework locates and crops the regions that may contain objects. Then, hierarchical spectral clustering is used to construct a category tree and a corresponding tree-like image classification model. During testing, hard-to-distinguish objects are classified from coarse to fine by searching a path through the category tree. The hierarchical classification method also provides insight into the data by identifying the groups of classes that are hard to classify and need more attention than others. In extensive experiments and comparative analyses on commodity detection in supermarkets and stores in Jinzhou city, MTCD-CNN outperforms other advanced methods, indicating that the proposed method effectively addresses the problem of confusingly similar categories.
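
The coarse-to-fine step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `CategoryNode` class, the scorer interface, and the toy category names are all hypothetical, and the hand-written scorer functions merely stand in for the CNN classifiers attached to each node of the category tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical interface: a scorer maps a feature vector to one score per child name.
Scorer = Callable[[List[float]], Dict[str, float]]


@dataclass
class CategoryNode:
    name: str
    scorer: Optional[Scorer] = None  # None for leaf (fine-grained) categories
    children: Dict[str, "CategoryNode"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return not self.children


def classify_coarse_to_fine(root: CategoryNode, features: List[float]) -> List[str]:
    """Walk from the root to a leaf, taking the best-scoring child at each level."""
    path = [root.name]
    node = root
    while not node.is_leaf():
        scores = node.scorer(features)
        best = max(node.children, key=lambda child: scores[child])
        node = node.children[best]
        path.append(node.name)
    return path


# Toy scorers standing in for per-node CNN classifiers (assumed, not from the paper).
def root_scorer(x):
    return {"drinks": x[0], "snacks": 1.0 - x[0]}


def drinks_scorer(x):
    return {"cola": x[1], "diet_cola": 1.0 - x[1]}


def snacks_scorer(x):
    return {"chips": x[1], "crackers": 1.0 - x[1]}


# Two coarse groups, each holding two visually similar fine categories.
tree = CategoryNode("root", root_scorer, {
    "drinks": CategoryNode("drinks", drinks_scorer, {
        "cola": CategoryNode("cola"),
        "diet_cola": CategoryNode("diet_cola"),
    }),
    "snacks": CategoryNode("snacks", snacks_scorer, {
        "chips": CategoryNode("chips"),
        "crackers": CategoryNode("crackers"),
    }),
})

print(classify_coarse_to_fine(tree, [0.9, 0.2]))  # ['root', 'drinks', 'diet_cola']
```

In this sketch each internal node only has to separate its own children, so the hard distinction between two similar fine categories is handled by a dedicated classifier rather than by one flat n-way classifier.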

Acknowledgements

The research was partially funded by the National Key R&D Program of China (Grant No. 2018YFB1003401), the National Outstanding Youth Science Program of the National Natural Science Foundation of China (Grant No. 61625202), the International (Regional) Cooperation and Exchange Program of the National Natural Science Foundation of China (Grant Nos. 61661146006, 61860206011), the National Natural Science Foundation of China (Grant No. 61602350), the Singapore–China NRF-NSFC Grant (Grant No. NRF2016NRF-NSFC001-111), and the Open Foundation of Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (No. znxx2018MS01).

Author information

Corresponding author

Correspondence to Kenli Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zou, X., Zhou, L., Li, K. et al. Multi-task cascade deep convolutional neural networks for large-scale commodity recognition. Neural Comput & Applic 32, 5633–5647 (2020). https://doi.org/10.1007/s00521-019-04311-9

  • DOI: https://doi.org/10.1007/s00521-019-04311-9
