ABSTRACT
To reduce the time-consuming and expensive process of manually annotating data and to enable lightweight deployment, this paper proposes a weakly supervised object detection method with a discrimination mechanism. We add a classification branch and a localization branch to the Darknet-53 backbone network of the YOLO model, use Global Average Pooling (GAP) and Softmax to classify selected regions, and adopt class activation maps for localization. In addition, we apply a model compression mechanism that prunes the model, reducing its size and meeting the lightweight goal. Together, these measures alleviate, to a certain extent, the problems described above. The results show that the improved model achieves good robustness and stability while maintaining the accuracy and efficiency of object detection, further improving its effectiveness in practical application scenarios.
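The classification and localization steps named above can be illustrated with the standard class activation map formulation: GAP reduces each feature map to a scalar, a linear layer with Softmax scores the classes, and the same class weights re-weight the feature maps to give a spatial map. The following is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the function names `gap_softmax_cam` and `cam_to_box`, the toy feature maps, and the relative threshold of 0.5 are all hypothetical, and the Darknet-53 backbone is replaced by a hand-made array.

```python
import numpy as np


def gap_softmax_cam(features, weights):
    """Classify with GAP + Softmax and build a class activation map.

    features: array of shape (K, H, W), the final convolutional feature maps.
    weights:  array of shape (C, K), the linear classifier applied after GAP.
    Both names are hypothetical; they stand in for a real backbone and head.
    """
    pooled = features.mean(axis=(1, 2))            # GAP: one scalar per map, shape (K,)
    logits = weights @ pooled                      # class scores, shape (C,)
    exp = np.exp(logits - logits.max())            # numerically stable Softmax
    probs = exp / exp.sum()
    cls = int(probs.argmax())
    # CAM: weighted sum of the feature maps using the predicted class's weights.
    cam = np.tensordot(weights[cls], features, axes=1)   # shape (H, W)
    return probs, cls, cam


def cam_to_box(cam, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around the high-activation region.

    rel_threshold is an assumed hyperparameter, applied after min-max scaling.
    Returns None when no cell reaches the threshold (e.g. a constant map).
    """
    lo, hi = cam.min(), cam.max()
    norm = (cam - lo) / (hi - lo + 1e-12)
    ys, xs = np.nonzero(norm >= rel_threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy input: map 0 fires on a 2x2 patch, map 1 fires weakly in one corner.
features = np.zeros((2, 4, 4))
features[0, 1:3, 1:3] = 1.0
features[1, 0, 0] = 0.2
weights = np.array([[1.0, 0.0],    # class 0 relies on map 0
                    [0.0, 1.0]])   # class 1 relies on map 1

probs, cls, cam = gap_softmax_cam(features, weights)
print(cls, cam_to_box(cam))        # 0 (1, 1, 2, 2)
```

Only image-level labels are needed to train the classifier weights in this formulation, which is why a class activation map suits the weakly supervised setting: the box is derived from the classifier rather than from annotated boxes.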