research-article

Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence

Authors:
Mengqing Jiang

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing, China

Yurong Jiang

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing, China

Min Li

Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China

Bo Meng

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing, China

Hong Song

School of Software, Beijing Institute of Technology, Beijing, China

Danni Ai

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing, China

Jian Yang

Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing, China


ICIAI '19: Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence, March 2019, Pages 21–27
https://doi.org/10.1145/3319921.3319950

Published: 15 March 2019

ABSTRACT

This study proposes a novel inference adaptive thresholding based non-maximum suppression (IAT-NMS) algorithm that derives temporal cues between the frames of a video sequence. Temporal connectivity is first inferred from an overlap measure between bounding boxes in adjacent frames. Frames containing high-confidence detections are taken as key frames to leverage the scores of neighboring detections and preserve potential detections of blurred objects with low scores. Then, the bounding boxes within each frame are ranked by confidence score, and the overlap ratio between the highest-scoring box and each of the remaining surrounding boxes is computed. This overlap measure is passed to a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of boxes that may belong to severely overlapped objects. The proposed method is compared with state-of-the-art video object detection techniques. With IAT-NMS applied, overlapping objects that the compared methods cannot distinguish become detectable. Experimental results demonstrate that this simple, unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study of performance limitations and sensitivity to parameter variations also finds that IAT-NMS has better detection capability than the three compared algorithms, which fail to detect all targets or to distinguish among multiple overlapping targets.
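
The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract gives no formulas, so the Gaussian decay `exp(-IoU^2 / sigma)` is borrowed from the well-known Soft-NMS penalty, and the temporal step (averaging a neighbor's score with a linked key-frame score) is a hypothetical rule. The function names, the `(x1, y1, x2, y2)` box layout, and all threshold values (`key_score`, `link_iou`, `sigma`, `min_score`) are assumptions made for the example.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def boost_from_key_frame(key_dets, neighbor_dets, key_score=0.8, link_iou=0.5):
    """Hypothetical temporal step (assumed rule, not from the paper):
    a neighbor-frame box that overlaps a confident key-frame box has its
    score raised to the mean of the two scores, if that is higher."""
    boosted = []
    for box, score in neighbor_dets:
        for kbox, kscore in key_dets:
            if kscore >= key_score and iou(box, kbox) >= link_iou:
                score = max(score, 0.5 * (score + kscore))
        boosted.append((box, score))
    return boosted

def gaussian_soft_suppress(dets, sigma=0.5, min_score=0.001):
    """Within-frame step: repeatedly take the highest-scoring box and decay
    the scores of the remaining boxes by exp(-IoU^2 / sigma) (assumed form),
    instead of discarding them outright."""
    remaining = list(dets)
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        top_box, top_score = remaining.pop(best)
        kept.append((top_box, top_score))
        decayed = []
        for box, score in remaining:
            score *= math.exp(-(iou(top_box, box) ** 2) / sigma)
            if score >= min_score:  # drop boxes whose score has become negligible
                decayed.append((box, score))
        remaining = decayed
    return kept

# Usage: each detection is a (box, score) pair.
frame = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((20, 20, 30, 30), 0.7)]
kept = gaussian_soft_suppress(frame)
```

In this sketch a heavily overlapped box keeps a reduced score rather than being removed, which is the behavior the abstract attributes to soft suppression; a hard-threshold NMS would delete the second box in `frame` entirely.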

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object detection

Published in

ICIAI '19: Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence
March 2019
279 pages
ISBN:9781450361286
DOI:10.1145/3319921

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Non-maximum suppression
Object detection
Video image
Qualifiers
- research-article
- Research
- Refereed limited