Selective deep ensemble for instance retrieval

Ding, Zhengyan; Song, Lei; Zhang, Xiaoteng; Xu, Zheng

doi:10.1007/s11042-018-5967-8

Published: 14 April 2018

Volume 78, pages 5751–5767 (2019)

Multimedia Tools and Applications

Zhengyan Ding^1, Lei Song^1,2, Xiaoteng Zhang^1 & Zheng Xu^1,2

Abstract

In public security systems, demand for visual instance retrieval is growing rapidly, especially over large-scale image and video databases. Because of its wide range of applications in surveillance scenarios, this paper focuses on retrieval tasks centered on ‘vehicle’ and ‘pedestrian’ targets. Many previous CNN-based methods do not exploit the ensemble abilities of different models and therefore achieve limited accuracy, since any single deep architecture is not comprehensive. In addition, some features in the original deep representation are useless for retrieval, whereas an attention-aware compact representation is much more efficient and effective. To address these problems, we propose a Selective Deep Ensemble (SDE) framework, inspired by the attention mechanism, that combines various models and features in a complementary way. We demonstrate that a large improvement can be obtained with only a slight increase in computation cost. Finally, we evaluate the framework on three public instance-retrieval datasets, VehicleID, VeRi and Market-1501, where it outperforms state-of-the-art methods by a large margin.
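
The abstract does not say how SDE selects or fuses features, so the following is only an illustrative sketch of the general idea of combining descriptors from several models after discarding uninformative dimensions. Everything in it is an assumption made for illustration: the function names, the toy feature values, and in particular the variance-based selection rule, which merely stands in for the attention mechanism and is not the authors' method.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def select_dims(gallery, keep):
    """Hypothetical selection rule: indices of the `keep` highest-variance dimensions."""
    n, dims = len(gallery), len(gallery[0])
    variances = []
    for d in range(dims):
        col = [g[d] for g in gallery]
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    ranked = sorted(range(dims), key=lambda d: variances[d], reverse=True)
    return sorted(ranked[:keep])

def ensemble_descriptor(per_model_feats, per_model_dims):
    """Keep the selected dimensions of each model, normalize per model, then concatenate."""
    parts = []
    for feat, dims in zip(per_model_feats, per_model_dims):
        parts.extend(l2_normalize([feat[d] for d in dims]))
    return l2_normalize(parts)

def rank_gallery(query_desc, gallery_descs):
    """Gallery indices ordered by descending cosine similarity to the query."""
    scores = [sum(q * g for q, g in zip(query_desc, gd)) for gd in gallery_descs]
    return sorted(range(len(gallery_descs)), key=lambda i: scores[i], reverse=True)

# Toy features from two hypothetical models for a gallery of three images.
# The third dimension of model A is constant, so it carries no retrieval signal.
gallery_a = [[1.0, 0.0, 5.0], [0.0, 1.0, 5.0], [1.0, 1.0, 5.0]]
gallery_b = [[0.0, 2.0], [2.0, 0.0], [2.0, 2.0]]

dims_a = select_dims(gallery_a, keep=2)
dims_b = select_dims(gallery_b, keep=2)

gallery_descs = [
    ensemble_descriptor([a, b], [dims_a, dims_b])
    for a, b in zip(gallery_a, gallery_b)
]
# The query has the same features as gallery image 0.
query_desc = ensemble_descriptor([[1.0, 0.0, 5.0], [0.0, 2.0]], [dims_a, dims_b])
ranking = rank_gallery(query_desc, gallery_descs)
```

In this toy setup the constant dimension of model A is dropped before fusion, so it cannot dominate the similarity scores; the per-model normalization keeps either model from outweighing the other purely because of its feature scale.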

Acknowledgements

The authors of this paper are members of the Shanghai Engineering Research Center of Intelligent Video Surveillance. Dr. Lei Song is also a visiting researcher with the Shenzhen Key Laboratory of Media Security, Shenzhen University, Shenzhen 518060, China. Our research was sponsored by the following projects: the National Natural Science Foundation of China (61402116, 61403084); the Program of the Science and Technology Commission of Shanghai Municipality (No. 15530701300, No. 15XD1520200, No. 17511106803); the 2012 IoT Program of the Ministry of Industry and Information Technology of China; the Key Project of the Ministry of Public Security (No. 2014JSYJA007); the Project of the Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University (ESSCKF 2015-03); the Shanghai Rising-Star Program (17QB1401000); the Special Fund for Basic R&D Expenses of Central Level Public Welfare Scientific Research Institutions (C17384); the National Key R&D Program of China (2016YFC0801304, 2017YFC0803705); the CCF-Venustech Open Research Fund (Grant No. CCF-VenustechRP2017006); and the Guangxi Key Laboratory of Cryptography and Information Security (No. GCIS201719).

Author information

Authors and Affiliations

The Third Research Institute of the Ministry of Public Security, Shanghai, 201204, China
Zhengyan Ding, Lei Song, Xiaoteng Zhang & Zheng Xu
Shenzhen Key Laboratory of Media Security, Shenzhen University & Guangxi Key Laboratory of Cryptography and Information Security, Shenzhen and Guilin, China
Lei Song & Zheng Xu

Corresponding author

Correspondence to Lei Song.

About this article

Cite this article

Ding, Z., Song, L., Zhang, X. et al. Selective deep ensemble for instance retrieval. Multimed Tools Appl 78, 5751–5767 (2019). https://doi.org/10.1007/s11042-018-5967-8

Received: 30 September 2017
Revised: 10 January 2018
Accepted: 03 April 2018
Published: 14 April 2018
Issue Date: March 2019
DOI: https://doi.org/10.1007/s11042-018-5967-8
