ABSTRACT
Accurate recognition of image targets is a fundamental intelligent perception task, and extracting effective features is its prerequisite. However, both hand-crafted features and those learned by deep neural networks still suffer from problems such as insufficient generalization ability, limited applicability, and insufficient robustness. This paper first selects representative hand-crafted features and deep convolutional features and analyzes their strengths and weaknesses from the perspective of saliency. Then, inspired by this analysis, a feature-saliency extraction process is modeled as the feature coding of an autoencoder based on the extreme learning machine (ELM-AE) and further integrated into an object recognition framework composed of a deep convolutional feature extractor and an extreme learning machine classifier. As a consequence, a hybrid neural network model for object recognition based on feature saliency is proposed. Finally, experimental results on the German Traffic Sign Recognition Benchmark (GTSRB) show that the proposed model achieves better performance.
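The pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' code: an ELM autoencoder (random hidden layer, closed-form output weights that reconstruct the input) re-codes deep convolutional features, and a plain ELM classifier predicts labels from the re-coded features. The layer sizes, `tanh` activation, ridge regularizer, and the toy random inputs standing in for GTSRB convolutional features are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_ae_encode(X, n_hidden=256, ridge=1e-3):
    """ELM-AE sketch: a random hidden layer plus closed-form output
    weights beta that reconstruct the input; beta then serves as a
    feature-coding matrix applied to the input."""
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                       # random projection
    # beta maps hidden activations back to the input (autoencoding)
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ X)
    return X @ beta.T                            # re-coded features

def elm_classify(X_train, y_onehot, X_test, n_hidden=512, ridge=1e-3):
    """Plain ELM classifier: random hidden layer + least-squares readout."""
    W = rng.standard_normal((X_train.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X_train @ W + b)
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y_onehot)
    H_test = np.tanh(X_test @ W + b)
    return (H_test @ beta).argmax(axis=1)

# Toy stand-in for deep convolutional features of traffic-sign images
X = rng.standard_normal((100, 64))
y = rng.integers(0, 4, size=100)
Y = np.eye(4)[y]                  # one-hot labels
Z = elm_ae_encode(X)              # feature-saliency coding step
pred = elm_classify(Z, Y, Z)      # classify the re-coded features
print(pred.shape)
```

Both stages are trained without backpropagation: the hidden weights stay random and only the output weights are solved in closed form, which is what makes the ELM-based coder and classifier fast to fit.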
Index Terms
- A Feature Saliency Based Hybrid Neural Network Model for Object Recognition