ALResNet: Attention-Driven Lightweight Residual Network for Fast and Accurate Image Recognition

Authors:
Chang Lu, College of Information Science and Technology, Jinan University, China
Rui Wang, College of Information Science and Technology, Jinan University, China
Beibei Huang, College of Information Science and Technology, Jinan University, China
Yuan Li, College of Information Science and Technology, Jinan University, China
Zunkai Huang, Shanghai Advanced Research Institute, Chinese Academy of Sciences, China
Yicong Zhou, Department of Computer and Information Science, University of Macau, China
Aiwen Luo, College of Information Science and Technology, Jinan University, China and Department of Computer and Information Science, University of Macau, China

MLMI '21: Proceedings of the 2021 4th International Conference on Machine Learning and Machine Intelligence
September 2021
Pages 21–29
https://doi.org/10.1145/3490725.3490729

Published: 29 December 2021

Index Terms

ALResNet: Attention-Driven Lightweight Residual Network for Fast and Accurate Image Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
  2. Machine learning
    1. Learning paradigms
    2. Machine learning approaches
      1. Neural networks
2. Software and its engineering

Index terms have been assigned to the content through auto-classification.


Published in

MLMI '21: Proceedings of the 2021 4th International Conference on Machine Learning and Machine Intelligence
September 2021
189 pages
ISBN:9781450384247
DOI:10.1145/3490725

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Channel split
Depthwise separable convolution
Fast image recognition
Lightweight residual networks
Spatial-channel attention
Qualifiers
- research-article
- Research
- Refereed limited

