Object Detection Algorithm Based on Second-Order Pooling Network and Gaussian Mixture Attention

Authors:
Sugang Ma, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China (ORCID: 0000-0002-8588-8111)
Ningbo Li, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China (ORCID: 0000-0002-6110-5741)
Xiaobao Yang, School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, China (ORCID: 0000-0003-1515-8663)

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition, September 2022, Pages 353–359
https://doi.org/10.1145/3573942.3574033

Published: 16 May 2023

ABSTRACT

To improve the feature representation ability of the YOLOX algorithm and obtain better detection performance, an object detection algorithm based on a second-order pooling network and Gaussian mixture attention is proposed. First, a second-order pooling network is added after the PAFPN. It obtains higher-order statistical information by computing the covariance matrix between different channels, which enhances the non-linear modeling capability of the network. Second, a mixture attention module based on the Gaussian function is added after the SPP to model global context in the spatial and channel dimensions respectively, which improves network performance with almost no extra parameters. Experimental results show that the detection accuracy of the proposed algorithm on the PASCAL VOC dataset reaches 82.6%, which is 1.6% higher than that of the YOLOX algorithm.
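
The two ideas named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the function names, the toy feature map, and the Gaussian width `c` are hypothetical, and the gating shown (a Gaussian of standardized per-channel means) is only an assumption about what parameter-free attention "based on the Gaussian function" might look like. The first function computes the channel covariance matrix that second-order pooling relies on; the second reweights channels without any learned parameters.

```python
import math

def channel_covariance(features):
    """Covariance matrix between channels.

    `features` is a list of C channels, each a list of N spatial values
    (an H x W map flattened to N = H * W). Hypothetical helper.
    """
    n = len(features[0])
    # Center every channel by subtracting its spatial mean.
    centered = []
    for ch in features:
        mean = sum(ch) / n
        centered.append([v - mean for v in ch])
    # Entry (i, j) is the average product of centered channels i and j.
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centered]
            for ci in centered]

def gaussian_channel_attention(features, c=2.0, eps=1e-5):
    """Reweight channels with a Gaussian of their standardized means.

    Assumed form: w = exp(-z^2 / (2 c^2)), where z is the per-channel
    global average, standardized across channels. No learned weights.
    """
    means = [sum(ch) / len(ch) for ch in features]
    mu = sum(means) / len(means)
    var = sum((m - mu) ** 2 for m in means) / len(means)
    z = [(m - mu) / math.sqrt(var + eps) for m in means]
    weights = [math.exp(-(zi ** 2) / (2 * c ** 2)) for zi in z]
    scaled = [[w * v for v in ch] for w, ch in zip(weights, features)]
    return scaled, weights

# Toy feature map: 3 channels, 4 spatial positions each.
x = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [4.0, 3.0, 2.0, 1.0]]
cov = channel_covariance(x)
_, w = gaussian_channel_attention(x)
```

In this toy input the second channel is twice the first, so their covariance is positive, while the third channel runs in the opposite direction and yields negative off-diagonal entries. The same Gaussian gating could in principle be applied over spatial positions instead of channels; how the paper combines the two dimensions is not stated in the abstract.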

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision problems
        Object detection

Published in

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition
September 2022
1221 pages
ISBN:9781450396899
DOI:10.1145/3573942

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Attention mechanism
Gaussian function
Object detection
Second-Order Pooling network
YOLOX
Qualifiers
- research-article
- Research
- Refereed limited