Abstract
Autonomous cars can now drive smoothly in ordinary situations, and it is widely recognized that realistic sensor simulation will play a critical role in solving the remaining corner cases by simulating them. To this end, we propose an autonomous driving simulator based upon neural radiance fields (NeRFs). Compared with existing works, ours has three notable features: (1) Instance-aware. Our simulator models the foreground instances and background environments separately with independent networks, so that the static (e.g., size and appearance) and dynamic (e.g., trajectory) properties of instances can be controlled separately. (2) Modular. Our simulator allows flexible switching among different modern NeRF-related backbones, sampling strategies, input modalities, etc. We expect this modular design to boost academic progress and industrial deployment of NeRF-based autonomous driving simulation. (3) Realistic. Our simulator sets new state-of-the-art photo-realism results given the best module selection. Our simulator will be open-sourced, while most of its counterparts are not. Project page: https://open-air-sun.github.io/mars/.
H. Zhao is sponsored by the Tsinghua-Toyota Joint Research Fund (20223930097).
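To make the instance-aware design concrete, below is a minimal sketch of compositional rendering with separate fields for the background and each foreground instance, where per-instance poses control dynamic properties and per-instance networks encode static ones. This is not the authors' implementation: all names (TinyField, InstanceNode, render_ray) are hypothetical, the toy MLP stands in for a modern NeRF backbone (e.g., a multiresolution hash grid), and the density-weighted blending is one common choice for merging fields along a ray.

```python
# Hypothetical sketch of instance-aware scene composition (not the MARS API).
import torch
import torch.nn as nn

class TinyField(nn.Module):
    """Stand-in radiance field: maps a 3D point to (density, RGB)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # 1 density + 3 color channels
        )

    def forward(self, x):                      # x: (N, 3)
        out = self.mlp(x)
        sigma = torch.relu(out[..., :1])       # non-negative density
        rgb = torch.sigmoid(out[..., 1:])      # colors in [0, 1]
        return sigma, rgb

class InstanceNode:
    """A foreground object: its own network plus a world-to-object pose.

    Editing `pose` moves the instance (dynamic property) without touching
    `field`, which encodes its size and appearance (static properties).
    """
    def __init__(self, field, pose):
        self.field = field                     # independent network
        self.pose = pose                       # (4, 4) world-to-object

    def query(self, pts_world):
        # Transform world-space samples into the object's canonical frame.
        h = torch.cat([pts_world, torch.ones_like(pts_world[..., :1])], -1)
        pts_obj = (h @ self.pose.T)[..., :3]
        return self.field(pts_obj)

def render_ray(origin, direction, background, instances, n_samples=64,
               t_near=0.1, t_far=10.0):
    """Composite background + instances along one ray (plain NeRF quadrature)."""
    t = torch.linspace(t_near, t_far, n_samples)
    pts = origin + t[:, None] * direction      # (n_samples, 3)

    # Query every field at the same samples; sum densities and blend
    # colors in proportion to each field's density contribution.
    sigma, rgb = background(pts)
    weighted_rgb = sigma * rgb
    for node in instances:
        s_i, c_i = node.query(pts)
        sigma = sigma + s_i
        weighted_rgb = weighted_rgb + s_i * c_i
    rgb_mix = weighted_rgb / sigma.clamp(min=1e-8)

    # Standard volume-rendering weights.
    delta = t[1] - t[0]
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans
    return (weights[:, None] * rgb_mix).sum(dim=0)  # final pixel color

# Usage: one background field plus one movable car instance.
bg = TinyField()
car = InstanceNode(TinyField(), pose=torch.eye(4))
color = render_ray(torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]), bg, [car])
```

Because each instance owns its network and pose, swapping the backbone or editing a trajectory touches only one component, which is the property the modular design above relies on.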
Cite this paper
Wu, Z., et al. (2024). MARS: An Instance-Aware, Modular and Realistic Simulator for Autonomous Driving. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds.) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science, vol. 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_1