ABSTRACT
Detecting vehicles in traffic surveillance footage requires large, high-quality training datasets to achieve competitive detection rates. We present an approach to the automatic synthesis of custom datasets that simulates the major influences on image appearance: viewpoint, camera parameters, sunlight, surrounding environment, etc. Our goal is to create a competitive vehicle detector which "has not seen a real car before." We use Blender as the modeling and rendering engine. We created a suitable scene graph, accompanied by a set of scripts, that allows simple configuration of the synthesized dataset. The generator can also store a rich set of metadata that serve as annotations of the synthesized images. We synthesized several experimental datasets and evaluated their statistical properties in comparison with real-life datasets. Most importantly, we trained a detector on the synthetic data; its detection performance is comparable to that of a detector trained on a state-of-the-art real-life dataset. Synthesizing a dataset of 10,000 images takes only several hours, which is much more efficient than manual annotation, not to mention the possibility of human error in manual annotation.
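The idea of a configurable generator that stores its own annotations can be illustrated with a minimal sketch. It is not the authors' Blender scripts: the parameter names, value ranges, environment presets, and the 15 m camera distance are all assumptions chosen for illustration, and the rendering step itself is omitted. The sketch only shows how each image's scene parameters could be sampled and kept as a metadata record, so the annotation comes for free with the image.

```python
import json
import math
import random
from dataclasses import dataclass, asdict


@dataclass
class SceneConfig:
    """Parameters of one synthetic image (all field names are hypothetical)."""
    image_id: int
    camera_azimuth_deg: float    # viewpoint around the vehicle
    camera_elevation_deg: float  # viewpoint above the road plane
    focal_length_mm: float       # camera parameter
    sun_elevation_deg: float     # sunlight direction
    environment: str             # surrounding environment preset


# Assumed environment presets; the source does not list any.
ENVIRONMENTS = ["urban", "highway", "rural"]


def sample_config(image_id, rng):
    """Draw one random scene configuration from assumed value ranges."""
    return SceneConfig(
        image_id=image_id,
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_elevation_deg=rng.uniform(5.0, 60.0),
        focal_length_mm=rng.uniform(18.0, 85.0),
        sun_elevation_deg=rng.uniform(10.0, 80.0),
        environment=rng.choice(ENVIRONMENTS),
    )


def camera_position(cfg, distance_m=15.0):
    """Convert the sampled viewpoint angles to a position on a sphere
    centred on the vehicle (spherical to Cartesian coordinates)."""
    az = math.radians(cfg.camera_azimuth_deg)
    el = math.radians(cfg.camera_elevation_deg)
    return (
        distance_m * math.cos(el) * math.cos(az),
        distance_m * math.cos(el) * math.sin(az),
        distance_m * math.sin(el),
    )


def generate_annotations(n_images, seed=0):
    """Sample n_images configurations and return them as metadata records.
    A real generator would render each configuration before storing it."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    records = []
    for image_id in range(n_images):
        cfg = sample_config(image_id, rng)
        record = asdict(cfg)
        record["camera_position_m"] = camera_position(cfg)
        records.append(record)
    return records


if __name__ == "__main__":
    # Print the metadata for two images as JSON.
    print(json.dumps(generate_annotations(2), indent=2))
```

Because the records are produced by the same process that defines the scene, they are exact by construction, which is the property that makes synthesized annotations free of human labeling error.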
Index Terms
- Cheap rendering vs. costly annotation: rendered omnidirectional dataset of vehicles