Abstract
In intelligent transportation systems, sensors such as radar and conventional frame cameras are used to improve system robustness in challenging scenarios. An event camera is a novel bio-inspired sensor that has attracted considerable research interest. It provides a form of neuromorphic vision that captures motion information asynchronously at high speed. It therefore offers advantages for intelligent transportation systems that conventional frame cameras cannot match, such as high temporal resolution, high dynamic range, sparse output, and minimal motion blur. This study proposes E-detector, an event-camera-based detector that detects moving objects asynchronously. The main innovation of our framework is that the spatiotemporal domain of the event camera can be adjusted according to different velocities and scenarios. This overcomes the inherent challenges that traditional cameras face when detecting moving objects in complex environments, such as high speed, complex lighting, and motion blur. Moreover, our approach adopts filter models and transfer learning to improve the performance of event-based object detection. Experiments show that our method detects high-speed moving objects better than state-of-the-art detection algorithms applied to conventional camera frames. The proposed approach is thus competitive and can be extended to other scenarios involving high-speed moving objects. The study findings are expected to unlock the potential of event cameras in intelligent transportation system applications.
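To make the idea of an adjustable spatiotemporal domain concrete, the following is a minimal sketch, not the E-detector implementation. It assumes events arrive as `(timestamp_us, x, y, polarity)` tuples sorted by time, and it uses a constant-event-count heuristic to choose the accumulation window: fast motion produces events more densely, so the window shrinks. The event layout, the function names `adaptive_window_us` and `accumulate`, and all parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical event layout: (timestamp_us, x, y, polarity)
Event = Tuple[int, int, int, int]


def adaptive_window_us(events: List[Event], target_count: int,
                       min_us: int, max_us: int) -> int:
    """Choose a window length that holds roughly target_count events.

    Assumes events are sorted by timestamp. Dense event streams (fast
    motion) yield a short window; sparse streams yield a long one.
    The result is clamped to [min_us, max_us].
    """
    if len(events) < target_count:
        return max_us
    # Span covering the first target_count events; +1 because the
    # accumulation window below is half-open.
    span = events[target_count - 1][0] - events[0][0] + 1
    return max(min_us, min(max_us, span))


def accumulate(events: List[Event], t_start: int, window_us: int,
               width: int, height: int) -> List[List[int]]:
    """Sum signed polarities per pixel over [t_start, t_start + window_us)."""
    frame = [[0] * width for _ in range(height)]
    t_end = t_start + window_us
    for t, x, y, p in events:
        if t_start <= t < t_end and 0 <= x < width and 0 <= y < height:
            frame[y][x] += 1 if p > 0 else -1
    return frame


# Synthetic streams: one event every 10 us (fast) vs. every 1000 us (slow).
fast = [(i * 10, i % 4, 0, 1) for i in range(100)]
slow = [(i * 1000, i % 4, 0, 1) for i in range(100)]

fast_window = adaptive_window_us(fast, 50, 100, 100_000)
slow_window = adaptive_window_us(slow, 50, 100, 100_000)
fast_frame = accumulate(fast, 0, fast_window, width=4, height=1)
```

In this sketch the fast stream gets a much shorter window than the slow one, so each accumulated frame contains a comparable number of events regardless of object speed. A frame produced this way could then be passed to a frame-based detector; how the paper actually couples the window to velocity and scenario is not specified in the abstract.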
Index Terms
- E-detector: Asynchronous Spatio-temporal for Event-based Object Detection in Intelligent Transportation System