Multi-attribute object detection benchmark for smart city

Wang, Yaowei; Yang, Zhouxin; Liu, Rui; Li, Deng; Lai, Yuandu; Ouyang, Lihan; Fang, Leyuan; Han, Yahong

doi:10.1007/s00530-022-00971-1

Regular Paper
Published: 12 July 2022

Volume 28, pages 2423–2435, (2022)
Multimedia Systems

Yaowei Wang¹,
Zhouxin Yang¹,
Rui Liu²,
Deng Li²,
Yuandu Lai²,
Lihan Ouyang³,
Leyuan Fang³ &
Yahong Han²

Abstract

Object detection recognizes and locates the objects in an image and has a wide range of applications in the visual understanding of complex urban scenes. Existing object detection benchmarks mainly focus on a single specific scenario, and their annotation attributes are not rich enough, which limits how well object detection models generalize to smart city scenes. Considering the diversity and complexity of scenes in intelligent city governance, we build a large-scale object detection benchmark for the smart city. Our benchmark contains about 100K images and covers three scenarios: intelligent transportation, intelligent surveillance, and drone. To reflect the complexity of real smart city scenes, the images in all three scenarios are annotated with weather, occlusion, and other environment attributes. We analyze the characteristics of the benchmark and conduct extensive experiments with current state-of-the-art object detection algorithms on it to show their performance. Our benchmark is available at https://openi.org.cn/projects/Benchmark.
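
To make the idea of multi-attribute annotation concrete, the following is a minimal sketch of how one annotated image might be represented and filtered. It is not the benchmark's actual file format: the class names `Box` and `ImageRecord`, the field names, the attribute values such as "rainy", and the helper functions `select` and `occlusion_rate` are all assumptions introduced for illustration. Only the three scenario names come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The three scenarios named in the abstract.
SCENARIOS = ("intelligent transportation", "intelligent surveillance", "drone")


@dataclass
class Box:
    """One annotated object (hypothetical schema, not the released format)."""
    label: str              # object category, e.g. "car" (assumed label name)
    x: float                # top-left corner, in pixels
    y: float
    w: float                # box width and height, in pixels
    h: float
    occluded: bool = False  # assumed per-object occlusion flag


@dataclass
class ImageRecord:
    """One image with image-level attributes plus its boxes (hypothetical)."""
    image_id: str
    scenario: str           # must be one of SCENARIOS
    weather: str            # assumed image-level attribute, e.g. "rainy"
    boxes: List[Box] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.scenario not in SCENARIOS:
            raise ValueError(f"unknown scenario: {self.scenario!r}")


def select(records: List[ImageRecord],
           scenario: Optional[str] = None,
           weather: Optional[str] = None) -> List[ImageRecord]:
    """Keep the records that match every attribute filter that was given."""
    return [
        r for r in records
        if (scenario is None or r.scenario == scenario)
        and (weather is None or r.weather == weather)
    ]


def occlusion_rate(records: List[ImageRecord]) -> float:
    """Fraction of boxes flagged as occluded; 0.0 when there are no boxes."""
    boxes = [b for r in records for b in r.boxes]
    if not boxes:
        return 0.0
    return sum(b.occluded for b in boxes) / len(boxes)


# Made-up sample data, only to exercise the helpers above.
records = [
    ImageRecord("img_0001", "drone", "sunny",
                [Box("car", 10, 20, 50, 30),
                 Box("person", 80, 40, 12, 30, occluded=True)]),
    ImageRecord("img_0002", "intelligent transportation", "rainy",
                [Box("car", 5, 5, 60, 40, occluded=True)]),
    ImageRecord("img_0003", "drone", "rainy"),
]
```

With records in this shape, a call such as `select(records, scenario="drone", weather="rainy")` would isolate one slice of the data, which is the kind of per-attribute breakdown that image-level annotations make possible when comparing detectors.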

Acknowledgments

This work is partially supported by the Natural Science Foundation of China under contract No. U19B2036, and by Peng Cheng Laboratory Research Project No. PCL2021A07.

Author information

Authors and Affiliations

Peng Cheng Laboratory, Shenzhen, China
Yaowei Wang & Zhouxin Yang
College of Intelligence and Computing, Tianjin University, Tianjin, China
Rui Liu, Deng Li, Yuandu Lai & Yahong Han
College of Electrical and Information Engineering, Hunan University, Changsha, China
Lihan Ouyang & Leyuan Fang

Corresponding author

Correspondence to Yahong Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, Y., Yang, Z., Liu, R. et al. Multi-attribute object detection benchmark for smart city. Multimedia Systems 28, 2423–2435 (2022). https://doi.org/10.1007/s00530-022-00971-1

Received: 03 April 2022
Accepted: 23 June 2022
Published: 12 July 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s00530-022-00971-1
