ABSTRACT
Image classification is a central task in computer vision and is widely used in daily life. In recent years, deep learning has advanced rapidly in this area. Deep learning-based methods can handle complex images that traditional image classification methods struggle to process, as well as large-scale image data that methods based on classical machine learning algorithms struggle to handle. This paper presents a comparative analysis of the performance of CNNs and RNNs on image classification. We compare three models commonly used in deep learning, a CNN model, an RNN model, and a hybrid CNN-RNN model, on the ILSVRC dataset. The comparison experiments show that the CNN model achieves higher accuracy and F1-score than the other two models, both in the overall classification results and in the results for individual categories. The results suggest that CNNs have better feature extraction ability than RNNs for image data, although further investigation is needed. CNNs have become the dominant approach in image classification owing to properties such as local connectivity, weight sharing, pooling operations, and a multilayer structure.
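The abstract reports accuracy and F1-score, both overall and per category. As a minimal sketch of how such metrics are commonly computed (not the authors' evaluation code; the function names `accuracy` and `per_class_f1` and the toy labels are assumptions made for illustration), per-class F1 can be derived from true-positive, false-positive, and false-negative counts:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return {label: F1} using the standard one-vs-rest definition."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores[c] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical toy labels, not data from the paper.
y_true = ["cat", "cat", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog"]
print(accuracy(y_true, y_pred))
print(per_class_f1(y_true, y_pred))
```

Reporting F1 per category alongside overall accuracy helps show whether one model is better across classes or only on the most frequent ones.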
CCS CONCEPTS • Computing methodologies → Computer vision tasks
Index Terms
- A Comparison Study of Convolutional Neural Network and Recurrent Neural Network on Image Classification