1 Introduction

Pedestrian detection is a core task in autonomous driving, where accurate and robust detection has a direct impact on the planning and decision making of autonomous vehicles [14]. In addition, pedestrian detection forms the basis for many promising vision tasks, such as pedestrian tracking [11], crowd sensing [25], activity reasoning [24], etc. Besides, the pedestrian, as a main traffic element, plays an influential role in traffic scene understanding and mapping [6]. Hence, many efforts have been devoted to its progress. However, there is still a large margin for improving detection performance, mainly because of many challenging factors: covering all pedestrians at different scales, distinct illumination, partial occlusion, motion blur, similar appearance to other non-human objects, and so forth.

Fig. 1.

Detection results on one image: the left is generated by Faster R-CNN [20] and the right by R-FCN [3].

Facing these problems, many works have been proposed. Among them, convolutional neural network (CNN) based models have achieved the best performance. For example, the faster region-based convolutional neural network (Faster R-CNN) [20] uses 9 anchors for bounding box regression, where a region proposal network (RPN) is embedded to speed up the proposal generation procedure. Redmon et al. [18] proposed the YOLO detection module, which predicts the coordinates of bounding boxes directly using fully connected layers on top of the convolutional feature extractor. Subsequently, some variants of YOLO were put forward, such as YOLOv2 and YOLO9000 [19]. The single shot multibox detector (SSD) [12] initializes a set of default boxes over different aspect ratios and scales within a feature map, and discretizes the output space of bounding boxes into these boxes. Although these works have complex architectures and delve into the intrinsic pedestrian representation, none of them obtains satisfactory performance; see Fig. 1 for a demonstration. One reason may be the dynamic challenging factors mentioned before, but another, more important reason is that it is difficult to learn an invariant representation of pedestrians in diverse environments. Supplemented by a 3D LiDAR sensor, we can gather physical geometric information about pedestrians, such as the height from the ground, area of occupancy, etc. This information can also be treated as a spatial context clue for inference. Actually, there is one former work [21] that addressed pedestrian detection by fusing LiDAR and visual clues. However, that method obtains neither a good calibration of the visual and LiDAR clues nor accurate detection, since it relies on naive neural networks. Though there are some works on detection using LADAR or laser sensors [13, 23], they are based on the hypothesis that all dynamic objects in front are pedestrians, so no object-class knowledge is exploited. In other words, LiDAR cannot distinguish the classes of different objects, but cameras can. Hence, it is natural to fuse the camera and LiDAR sensors together, which requires a calibration to tackle their heterogeneous and asynchronous properties. Actually, the vision+X paradigm is becoming the main trend for scene understanding.

To this end, this work first performs an accurate calibration of the visual and LiDAR sensors and updates the calibration parameters in an online way. Second, we take Faster R-CNN as the basis for generating pedestrian proposals and eliminate wrong detections by imposing constraints from physical geometric clues, including the dominant distance of the pedestrian within a proposal, the height from the ground, and the dynamic variation of the area occupancy of pedestrians. By that, the pedestrian proposals generated by Faster R-CNN are significantly cleaned. The detailed flowchart is demonstrated in Fig. 2.

Fig. 2.

The flowchart of the proposed method.

2 Related Works

This work mainly aims to boost CNN-based pedestrian detection performance with the auxiliary of a 3D LiDAR sensor. We review the related works on CNN-based pedestrian detectors and on detection modules using non-vision approaches, such as LiDAR, laser, etc.

CNN-based pedestrian detection: Recently, there have been many detection works of interest built on deep convolutional neural networks (CNNs) [4, 12]. Within this framework, great progress in pedestrian detection has been made compared with previous works based on hand-crafted features, such as the deformable part-based model (DPM) [5]. The core purpose of these CNN-based detectors is to capture the intrinsic or structural information implied by large-scale pedestrian samples with respect to the scale space [12, 20, 27] or geometric constraints, such as part geometry [16]. For example, Faster R-CNN [20], inspired by R-CNN [7], samples object proposals with multiple anchor scales and speeds up proposal generation with a region proposal network (RPN). Cai et al. [2] proposed a unified multi-scale deep neural network (denoted as MS-CNN) to address the scale issue. A similar issue was also considered in the work on scale-adaptive deconvolutional regression (SADR) [27] and scale-aware Fast R-CNN [10]. The single shot multibox detector (SSD) [12] predicts category scores and box offsets for a set of default bounding boxes on feature maps, which is faster than the single-box module of YOLO [18]. Besides the scale issue, some studies concentrate on the structural information implied by different parts of pedestrians. Within this category, Ouyang et al. [16] jointly estimated the visibility relationship of different parts of the same pedestrian to solve the partial-occlusion problem. They also proposed a deformable deep convolutional neural network for generic object detection [17], where they introduced a new deformation-constrained pooling layer modeling the deformation of object parts with geometric constraints and penalties. Although these CNN-based detectors search for an intrinsic and structural representation of pedestrians, robust detection still remains very difficult because of the diverse environments.

Non-vision pedestrian detection: Apart from the universal vision-based modules for pedestrian detection, some researchers have explored this problem with non-vision approaches, including LiDAR [8, 23], LADAR [13], and so on. Within this domain, geometric features, such as the edge, skeleton, and width of the scan line, are the main kinds of features. For example, Navarro-Serment et al. [13] utilized LADAR to detect pedestrians under the constraint of height from the ground. Oliveira and Nunes [15] introduced a LiDAR sensor to segment the scan lines of pedestrians from the background, taking spatial context into consideration. Börcs et al. [1] detected objects instantly by 3D LiDAR point cloud segmentation, where a convolutional neural network was utilized to learn object information from a depth image estimated from the 3D LiDAR point cloud. Wang et al. [23] also adopted a 3D LiDAR sensor to detect and track pedestrians. In their work, they first clustered the point cloud into several blobs and manually labeled many samples. Then a support vector machine (SVM) was used to learn the geometric clues of pedestrians.

In summary, the information acquired by non-vision sensors consists only of geometric clues without explicit class information. Hence, in some circumstances, frequent false detections are generated, whereas vision-based methods can distinguish different classes. Nevertheless, non-vision modules are superior to vision-based ones in adapting to different environments. It is therefore natural to fuse camera and non-vision modules together to obtain a boosted detection performance. Hence, this work utilizes a 3D LiDAR sensor as an attempt.

3 Accurate Calibration of 3D LiDAR and Camera

For boosting pedestrian detection performance, the primary task is to calibrate the camera to the 3D LiDAR, since both sensors must target the same objects. The calibration amounts to computing the intrinsic parameters of the camera and the extrinsic parameters relating the two sensors, i.e., the translation vector \(\mathbf{{t}}\) and the rotation matrix \(\mathbf{{R}}\in \mathbb {R}^{3\times 3}\). In this work, the intrinsic camera parameters are computed by Zhang's calibration method [26]. For the extrinsic parameters, this work introduces an online automatic calibration method [9] to carry out an accurate calibration of our camera and 3D LiDAR sensors. It aims to maximize the overlap of geometric structure. Different from other off-line calibrations [22, 26], it optimizes the extrinsic parameters using the most recently observed frames. Specifically, six values are calculated during the optimization: the {\(\varDelta x\), \(\varDelta y\), \(\varDelta z\)} translations and the {roll, pitch, yaw} Euler-angle rotations between the camera and the 3D LiDAR sensor. Given a calibration \(\mathbf{{t}}\) and \(\mathbf{{R}}\), we first project the 3D LiDAR points onto the image plane of the visual camera. Then, the objective function for optimization is specified as:

$$\begin{aligned} \max \; \sum \limits _{f = n - w}^{n} \sum \limits _{p = 1}^{\left| V^f \right| } V_p^f \, S_{i,j}^f, \end{aligned}$$
(1)

where w is the number of frames used for optimization (set to 9 in this work), n is the newest observed video frame, p is the index into the 3D point set \(\{{V_p^f}\}_{p=1}^{|V^f|}\) obtained by the 3D LiDAR sensor, and \(S_{i,j}^f\) is the value at pixel (i, j) of the edge map S in the \(f^{th}\) frame, i.e., at the location onto which \(V_p^f\) projects. Note that the point sets from the 3D LiDAR and the camera do not cover the whole plane. Actually, the points from both sensors are edge points. For the image, \(S_{i,j}^f\) is extracted by edge detection followed by an inverse distance transform, and \(\{{V_p^f}\}\) is obtained by computing the distance differences of the scene relative to the 3D LiDAR (taken as the origin of coordinates). Some typical calibration results are shown in Fig. 3, from which we can see that highly accurate calibration results are obtained. Thus, we accomplish the calibration of the camera and 3D LiDAR sensors.
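To make Eq. (1) concrete, the following is a minimal sketch of how a single candidate extrinsic pair \((\mathbf{{R}}, \mathbf{{t}})\) might be scored over the last w frames; the function name, array conventions, and the outer search over candidates are our own assumptions and not the implementation of [9].

```python
import numpy as np

def calibration_score(R, t, K, lidar_edge_points, edge_images):
    """Evaluate Eq. (1) for one candidate extrinsic calibration (R, t).

    lidar_edge_points: list (one entry per frame) of (N_f, 3) arrays of
        LiDAR edge points V^f in the sensor frame.
    edge_images: list of 2D arrays S^f, i.e. the inverse-distance-transformed
        edge maps of the corresponding camera frames.
    K: 3x3 camera intrinsic matrix (e.g. from Zhang's calibration).
    """
    score = 0.0
    for V, S in zip(lidar_edge_points, edge_images):
        # Transform LiDAR edge points into the camera frame and project.
        P_cam = (R @ V.T).T + t                 # (N, 3)
        P_cam = P_cam[P_cam[:, 2] > 0]          # keep points in front of camera
        pix = (K @ P_cam.T).T
        u = (pix[:, 0] / pix[:, 2]).astype(int)
        v = (pix[:, 1] / pix[:, 2]).astype(int)
        h, w = S.shape
        valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # Sum the edge-map response S^f_{i,j} at every projected edge point.
        score += S[v[valid], u[valid]].sum()
    return score
```

In an online calibration of this kind, the score would be re-evaluated for small perturbations of the six extrinsic values around the current estimate, and the best-scoring candidate kept for the next frames.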

Fig. 3.

Typical calibration results of camera and 3D LiDAR.

4 Boosting CNN-Based Detectors by Fusing Physical Geometric Clues of Pedestrians

4.1 Pedestrian Proposal Generation by CNN-Based Detectors

After the calibration, we have a fundamental precondition for tackling the pedestrian detection problem by fusing visual color and the real distance of the target. However, even with the calibration in place, some issues remain for detection. The main difficulty is the heterogeneous property, i.e., the sparsity and the physical meaning of the points from the two sensors are rather different. In addition, the 3D points captured by LiDAR do not carry class information. Therefore, in this work, we treat the CNN-based detector as the basis and use physical geometric clues from the 3D points to rectify the generated pedestrian proposals. Recently, many works with deep network architectures have addressed pedestrian detection; however, none of them achieves satisfactory performance. Therefore, this work takes Faster R-CNN [20] as an attempt, and the erroneous pedestrian proposals are eliminated by fusing the following physical geometric clues.

4.2 Physical Geometric Clue Fusion for Pedestrian Detection

In this subsection, we describe in detail how the physical geometric clues extracted by the 3D LiDAR are fused. As is well known, the height of most walking persons falls in the range of 1 to 2 m, and a person occupies a region of at most \(0.5 \times 2\) m\(^2\). In addition, the region occupied by a human remains relatively static. Therefore, this work extracts static and dynamic physical geometric clues of the pedestrian, including the height from the ground, the occupancy dominance within a pedestrian proposal, and the dynamic occupancy variation with respect to the scale variation of the proposal.

(1) Static Geometrical Clues

Occupancy dominance (OD): The pedestrian proposals are generally represented by bounding boxes. We observe that the 3D points are distributed sparsely and roughly uniformly within a bounding box. The distance of each 3D point in a bounding box is computed as \(r=\sqrt{x^2+y^2+z^2}\), where r represents the distance of a 3D point (x, y, z). Note that, because of the sparsity of the 3D points, some pixels in the color image have no distance information, usually denoted as \((\infty ,\infty ,\infty )\). Besides, a bounding box inevitably contains some background region whose distance is much larger than that of the pedestrian. In addition, the distances of the 3D points on a pedestrian are always similar, and the pedestrian occupies the dominant part of the bounding box. Inspired by this insight, this work puts forward occupancy dominance to eliminate bounding boxes whose content differs substantially from a pedestrian. Specifically, we sort the distances of the 3D points in a bounding box in ascending order and observe that a true pedestrian always yields the widest zone of nearly constant distance; see Fig. 4 for an example. Thus, the main step of occupancy dominance is to extract the largest smooth part of the sorted distance curve. For this purpose, we compute the difference between adjacent points on this distance curve and set the difference to 0 when it is lower than 0.3 m. Then, we segment the curve into several fragments, and the length of the largest fragment is taken as the occupancy of the bounding box. With this clue, we can get rid of proposals without a dominant object region.

Fig. 4.

The illustration of occupancy dominance.
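Below is a minimal sketch of how the occupancy-dominance measure could be computed for one proposal; the 0.3 m gap threshold follows the text, while the function name and the array conventions are our own assumptions.

```python
import numpy as np

def occupancy_dominance(points_in_box, gap=0.3):
    """Length of the largest 'flat' fragment of the sorted distance curve.

    points_in_box: (N, 3) array of 3D LiDAR points (x, y, z) whose image
    projections fall inside one pedestrian proposal; points without range
    data (inf) are ignored.
    """
    r = np.linalg.norm(points_in_box, axis=1)   # r = sqrt(x^2 + y^2 + z^2)
    r = np.sort(r[np.isfinite(r)])              # ascending distance curve
    if r.size == 0:
        return 0
    # Split the curve wherever adjacent distances differ by 0.3 m or more;
    # each piece is one fragment of roughly constant distance.
    cut_points = np.where(np.diff(r) >= gap)[0] + 1
    fragments = np.split(r, cut_points)
    # The occupancy of the box is the length of the largest fragment.
    return max(len(f) for f in fragments)
```

A proposal would then be discarded when its largest fragment does not clearly dominate the points inside the box; the exact rejection threshold is not specified in the text.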

Height-width constraint (HC): In driving circumstances, the height of a walking pedestrian usually falls into a finite range, e.g., from 1.2 m to 2 m. Therefore, given a pedestrian proposal, its height cannot exceed 2.5 m. In this paper, for a bounding box, we specify the height constraint as \(0.8<(h_{max}-h_{min})<1.5\) m. With this constraint, proposals that are too small or too large are removed.
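A corresponding sketch of the height constraint on the 3D points inside a proposal is given below; treating z as the vertical axis is our assumption, and in practice the check would be applied to the dominant-fragment points found by OD.

```python
import numpy as np

def height_constraint_ok(points_in_box, lo=0.8, hi=1.5):
    """HC: keep a proposal only if the vertical extent of its 3D points
    satisfies 0.8 < (h_max - h_min) < 1.5 metres."""
    z = points_in_box[:, 2]          # assumed up axis
    z = z[np.isfinite(z)]
    if z.size == 0:
        return False
    return lo < (z.max() - z.min()) < hi
```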

(2) Dynamic Geometrical Clues

Dynamic occupancy (DO): In addition to the static clues, we also exploit dynamic clues to remove wrongly detected proposals. The reason is that the occupancy (defined before) of the human body in the bounding box remains constant, i.e., the fragment length in Fig. 4 remains almost unchanged when the scale of the bounding box varies. On the contrary, objects such as trees, which are often falsely detected as pedestrians, may have rather different sizes and exhibit a dynamic occupancy (denoted as DO) when the scale of the bounding box varies. Hence, we further assess the quality of a pedestrian proposal by varying the height of its bounding box and examining whether the dominant occupancy varies in direct proportion to the height. If not, the proposal is a pedestrian proposal. Specifically, dynamic occupancy (DO) in this paper is computed by enlarging the height of the bounding box by a factor of 1.3.
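A rough sketch of the dynamic-occupancy check is shown below, reusing occupancy_dominance from the earlier sketch; the 1.3x height enlargement follows the text, while the point-selection helper, the tolerance, and the symmetric growth of the box are our own assumptions.

```python
import numpy as np

def points_inside(uv, xyz, box):
    """Select 3D points whose calibrated image projections uv fall inside box.
    uv: (N, 2) projected pixel coordinates; xyz: (N, 3) LiDAR points;
    box: (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    m = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    return xyz[m]

def dynamic_occupancy_ok(uv, xyz, box, scale=1.3, tol=0.2):
    """DO: enlarge the proposal height by `scale` and recompute occupancy
    dominance. A true pedestrian keeps roughly the same dominant fragment,
    whereas background structures such as trees grow together with the box."""
    x1, y1, x2, y2 = box
    grow = 0.5 * (scale - 1.0) * (y2 - y1)
    enlarged = (x1, y1 - grow, x2, y2 + grow)
    occ = occupancy_dominance(points_inside(uv, xyz, box))
    occ_big = occupancy_dominance(points_inside(uv, xyz, enlarged))
    # Reject the proposal if the dominant occupancy grows with the box height.
    return occ > 0 and occ_big <= (1.0 + tol) * occ
```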

Although the above clues are all quite simple, they are intuitive and, as verified by the following experiments, they significantly boost the performance of the CNN-based detector.
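Putting the clues together, the overall filtering of Faster R-CNN proposals might look like the sketch below, built from the functions sketched above; the minimum-occupancy threshold is hypothetical.

```python
def filter_proposals(proposals, uv, xyz, occ_min=30):
    """Keep only the detector proposals that satisfy all geometric clues.
    proposals: list of boxes (x1, y1, x2, y2); occ_min is a hypothetical
    minimum number of points in the dominant fragment."""
    kept = []
    for box in proposals:
        pts = points_inside(uv, xyz, box)
        if (occupancy_dominance(pts) >= occ_min           # OD
                and height_constraint_ok(pts)             # HC
                and dynamic_occupancy_ok(uv, xyz, box)):  # DO
            kept.append(box)
    return kept
```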

5 Experiments and Discussions

5.1 Dataset Acquisition

We collect the experimental data with an autonomous vehicle named "Kuafu", developed by the Laboratory of Visual Cognitive Computing and Intelligent Vehicles of Xi'an Jiaotong University. In this work, a Velodyne HDL-64E S2 LiDAR sensor with 64 beams and a high-resolution camera system with differential GPS/inertial information are equipped in the acquisition system. The visual camera has a resolution of \(1920\times 1200\) and a frame rate of 25 fps. In addition, the scanning frequency of the 3D LiDAR is 10 Hz. The dataset contains 5000 frames with 5771 pedestrian proposals in the ground truth, manually labeled by ourselves. It is worth noting that this work treats detected proposals with a detection score larger than 0.8 as truly detected pedestrians. Hence the performance of the proposed method cannot be represented by a precision-recall curve.

5.2 Metrics for Evaluation

To evaluate the performance, this paper uses the precision and recall values. The precision value is the ratio of proposals correctly detected as pedestrians to all detected proposals, while the recall value is the percentage of detected pedestrian proposals relative to the number of ground-truth pedestrians. For the performance evaluation, this work adds the constraints, i.e., OD, HC, and DO, gradually, so that the contribution of each clue can be presented. In addition, the occupancy dominance (OD) clue is essential for HC and DO; therefore, we deploy it in all configurations.
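For completeness, a minimal sketch of how the two metrics are computed from the detection counts; the variable names are ours.

```python
def precision_recall(num_detected, num_correct, num_ground_truth):
    """Precision: correct detections over all detected proposals.
    Recall: correct detections over all ground-truth pedestrians."""
    precision = num_correct / num_detected if num_detected else 0.0
    recall = num_correct / num_ground_truth if num_ground_truth else 0.0
    return precision, recall
```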

Table 1. Precision and recall values for different physical geometric clue embeddings. For a clearer comparison, we also report the numbers of detected proposals (DPs) and wrongly detected proposals (WDPs). The best precision and recall values are marked in bold.
Fig. 5.

Some typical snapshots of the results generated by embedding different clues. The first row shows the results of Faster R-CNN [20]. The second row shows the results of RGB+3D-LiDAR with the OD and DO clues embedded. The third row shows the results after embedding OD and HC, and the results with all physical geometric clues are presented in the last row.

5.3 Performance Evaluation

The detection efficiency of the proposed method is 5 fps. Table 1 reports the precision and recall values after embedding different physical geometric clues. From this table, we can observe that the more clues are added, the better the precision and the worse the recall. It may seem that adding more clues prevents the detector from robustly detecting all pedestrians, as in the \(1606^{th}\) and \(1651^{th}\) frames. Actually, by inspecting the visual results, more clues are necessary, as they remove the wrongly detected proposals to a larger extent. Meanwhile, the drop in the recall value when embedding all clues is mainly caused by our removal of pedestrian proposals whose distance from our vehicle is larger than about 50 m, which is totally acceptable in practical situations; the \(1623^{th}\) frame is an example (Fig. 5).

5.4 Discussions

In this work, we only take Faster R-CNN [20] as an attempt. Actually, the specific detector is not the focus, and the approach applies similarly to other CNN-based detectors. In addition, the way the 3D LiDAR is utilized is not restricted to this kind of module. The purpose of this work is to show that the performance of CNN-based detectors can be boosted by fusing some simple and intuitive geometric clues extracted from a 3D LiDAR sensor, and that convincing results can be generated.

6 Conclusion

This paper introduced the 3D LiDAR sensor in a novel way to boost the performance of CNN-based detectors, with Faster R-CNN utilized as an attempt. Facing the heterogeneous and asynchronous properties of the two different sensors, this work first calibrated the RGB and LiDAR data with an online module which can adapt to dynamic scenes more effectively. Then, some physical geometric clues acquired by the 3D LiDAR were exploited to eliminate the erroneous pedestrian proposals. Exhaustive experiments verified the superiority of the proposed method. In the future, richer fusion modules for camera and 3D LiDAR will be our focus.