DOI: 10.1145/3582197.3582215
research-article

A Comparison Study of Convolutional Neural Network and Recurrent Neural Network on Image Classification

Published: 30 March 2023

ABSTRACT

Image classification is a core task in computer vision and is widely used in daily life. In recent years, deep learning has advanced rapidly in this field. Deep-learning-based methods can handle not only complex images that are difficult for traditional image classification methods to process, but also large-scale image data that are difficult for methods based on classical machine learning algorithms. The motivation of this paper is to compare the performance of CNNs and RNNs on image classification. We use a CNN model, an RNN model, and a hybrid CNN-RNN model, all of which are commonly used in deep learning, and compare their classification performance on the ILSVRC dataset. The comparison experiments show that the CNN model achieves better accuracy and F1-score than the other two models, both in the overall classification results and in the results for individual categories. The results indicate that CNNs have better feature extraction ability than RNNs for image data, although further investigation is needed. CNNs have become the dominant approach in image classification owing to properties such as local connectivity, weight sharing, pooling operations, and a multilayer structure.
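The abstract compares models by accuracy and F1-score, both overall and per category. The paper's evaluation code is not shown on this page, so the following is only a minimal sketch of how such metrics are commonly computed from true and predicted labels. The function names (`per_class_scores`, `accuracy`, `macro_f1`) and the toy labels are hypothetical and are not taken from the paper; macro-averaging is an assumption about how an overall F1 might be formed.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was the truth but was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pred_total = tp[label] + fp[label]
        true_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / true_total if true_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels for illustration only; these are not results from the paper.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]

scores = per_class_scores(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy(y_true, y_pred):.2f} macro_f1={macro_f1(scores):.2f}")
```

Reporting the per-class tuple alongside the overall numbers mirrors the abstract's distinction between overall results and results for individual categories: a model can have high accuracy while scoring poorly on a specific class.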

CCS CONCEPTS • Computing methodologies → Computer vision tasks

        • Published in

          ICIT '22: Proceedings of the 2022 10th International Conference on Information Technology: IoT and Smart City
          December 2022
          385 pages
          ISBN: 9781450397438
          DOI: 10.1145/3582197

          Copyright © 2022 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • research-article
          • Research
          • Refereed limited