Abstract
Point cloud frame interpolation is a challenging task that requires accurate scene flow estimation across frames while preserving geometric structure. Prevailing techniques often rely on pre-trained motion estimators or intensive test-time optimization, resulting in compromised interpolation accuracy or prolonged inference. This work presents FastPCI, which introduces a Pyramid Convolution-Transformer architecture for point cloud frame interpolation. Our hybrid Convolution-Transformer improves both local and long-range feature learning, while the pyramid network offers multilevel features and reduces computation. In addition, FastPCI proposes a unique Dual-Direction Motion-Structure block for more accurate scene flow estimation. Our design is motivated by two facts: (1) accurate scene flow preserves 3D structure, and (2) the point cloud at the previous timestep should be reconstructable using the reverse motion from the future timestep. Extensive experiments show that FastPCI significantly outperforms the state-of-the-art PointINet and NeuralPCI with notable gains (e.g., 26.6% and 18.3% reduction in Chamfer Distance on KITTI), while being more than 10× and 600× faster, respectively. Code is available at https://github.com/genuszty/FastPCI.
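To make the dual-direction intuition in facts (1) and (2) concrete, the sketch below shows one way such a consistency objective could be written. This is not the FastPCI implementation; it is a minimal PyTorch illustration that assumes per-point forward and backward flows are already available, and uses simple linear warping plus a symmetric Chamfer Distance. The names `chamfer_distance` and `dual_direction_loss` are illustrative, not taken from the paper.

```python
import torch

def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3)."""
    d = torch.cdist(a, b)                                  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def dual_direction_loss(p0, p1, flow_fwd, flow_bwd, t=0.5):
    """Toy dual-direction consistency: forward-warped p0 should match p1,
    backward-warped p1 should reconstruct p0, and the intermediate frame
    estimated from either side should agree."""
    loss_fwd = chamfer_distance(p0 + flow_fwd, p1)         # fact (1): accurate flow maps p0 onto p1's structure
    loss_bwd = chamfer_distance(p1 + flow_bwd, p0)         # fact (2): reverse motion from the future reconstructs p0
    pt_from_p0 = p0 + t * flow_fwd                         # interpolate from the past frame
    pt_from_p1 = p1 + (1.0 - t) * flow_bwd                 # interpolate from the future frame
    return loss_fwd + loss_bwd + chamfer_distance(pt_from_p0, pt_from_p1)

# Stand-in usage with random points (real inputs would be consecutive LiDAR frames):
p0, p1 = torch.randn(1024, 3), torch.randn(1024, 3)
loss = dual_direction_loss(p0, p1, torch.zeros_like(p0), torch.zeros_like(p1))
print(loss.item())
```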
References
Bao, W., Lai, W.S., Ma, C., Zhang, X., Gao, Z., Yang, M.H.: Depth-aware video frame interpolation. In: CVPR (2019)
Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: CVPR, pp. 11621–11631 (2020)
Chang, M.F., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: CVPR, pp. 8748–8757 (2019)
Garrido, D., Rodrigues, R., Augusto Sousa, A., Jacob, J., Castro Silva, D.: Point cloud interaction and manipulation in virtual reality. In: 2021 5th International Conference on Artificial Intelligence and Virtual Reality (AIVR), pp. 15–20 (2021)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR, pp. 3354–3361 (2012)
Huang, Z., Zhang, T., Heng, W., Shi, B., Zhou, S.: Real-time intermediate flow estimation for video frame interpolation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. ECCV 2022. LNCS, vol. 13674. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19781-9_36
Kalluri, T., Pathak, D., Chandraker, M., Tran, D.: FLAVR: flow-agnostic video representations for fast frame interpolation. In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 2070–2081 (2023)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kong, L., et al.: IFRNet: intermediate feature refine network for efficient frame interpolation. In: CVPR (2022)
Li, G., et al.: DeepGCNs: making GCNs go as deep as CNNs. IEEE TPAMI (2021)
Li, X., Kaesemodel Pontes, J., Lucey, S.: Neural scene flow prior. NeurIPS 34 (2021)
Liu, H., Liao, K., Lin, C., Zhao, Y., Guo, Y.: Pseudo-LiDAR point cloud interpolation based on 3D motion representation and spatial supervision. IEEE Trans. Intell. Transp. Syst. 23(7), 6379–6389 (2021)
Liu, H., Liao, K., Lin, C., Zhao, Y., Liu, M.: PLIN: a network for pseudo-LiDAR point cloud interpolation. Sensors 20(6), 1573 (2020)
Lu, F., Chen, G., Qu, S., Li, Z., Liu, Y., Knoll, A.: PointINet: point cloud frame interpolation network. In: AAAI, pp. 2251–2259 (2021)
Lu, L., Wu, R., Lin, H., Lu, J., Jia, J.: Video frame interpolation with transformer. In: CVPR (2022)
Luo, W., Yang, B., Urtasun, R.: Fast and furious: real-time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In: CVPR, pp. 3569–3577 (2018)
Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 405–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_24
Niklaus, S., Liu, F.: Context-aware synthesis for video frame interpolation. In: CVPR (2018)
Park, J., Ko, K., Lee, C., Kim, C.-S.: BMBC: bilateral motion estimation with bilateral cost volume for video interpolation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12359, pp. 109–125. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6_7
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. NeurIPS 32 (2019)
Pumarola, A., Corona, E., Pons-Moll, G., Moreno-Noguer, F.: D-NeRF: neural radiance fields for dynamic scenes. In: CVPR, pp. 10318–10327 (2021)
Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: CVPR (2017)
Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: NeurIPS (2017)
Qian, G., Hamdi, A., Zhang, X., Ghanem, B.: Pix4Point: image pretrained standard transformers for 3D point cloud understanding. In: 2024 International Conference on 3D Vision (3DV), pp. 1280–1290 (2024)
Qian, G., Hammoud, H., Li, G., Thabet, A., Ghanem, B.: ASSANet: an anisotropic separable set abstraction for efficient point cloud representation learning. NeurIPS 34 (2021)
Qian, G., et al.: PointNeXt: revisiting PointNet++ with improved training and scaling strategies. In: NeurIPS (2022)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241 (2015)
Sim, H., Oh, J., Kim, M.: XVFI: extreme video frame interpolation. In: ICCV (2021)
Thomas, H., Qi, C.R., Deschaud, J.E., Marcotegui, B., Goulette, F., Guibas, L.J.: KPConv: flexible and deformable convolution for point clouds. In: ICCV (2019)
Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (TOG) (2019)
Wu, W., Qi, Z., Fuxin, L.: PointConv: deep convolutional networks on 3D point clouds. In: CVPR (2019)
Wu, W., Wang, Z.Y., Li, Z., Liu, W., Fuxin, L.: PointPWC-Net: cost volume on point clouds for (Self-)supervised scene flow estimation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 88–107. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_6
Wu, X., Lao, Y., Jiang, L., Liu, X., Zhao, H.: Point transformer v2: grouped vector attention and partition-based pooling. Adv. Neural Inf. Process. Syst. 35, 33330–33342 (2022)
Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. CoRR (2015)
Yin, T., Zhou, X., Krähenbühl, P.: Center-based 3D object detection and tracking. In: CVPR (2021)
Zeng, Y., Qian, Y., Zhang, Q., Hou, J., Yuan, Y., He, Y.: IDEA-Net: dynamic 3D point cloud interpolation via deep embedding alignment. In: CVPR (2022)
Zhang, G., Zhu, Y., Wang, H., Chen, Y., Wu, G., Wang, L.: Extracting motion and appearance via inter-frame attention for efficient video frame interpolation. In: CVPR, pp. 5682–5692 (2023)
Zhang, Z., Hu, L., Deng, X., Xia, S.: Sequential 3D human pose estimation using adaptive point cloud sampling strategy. In: IJCAI, pp. 1330–1337 (2021)
Zhao, H., Jiang, L., Jia, J., Torr, P.H., Koltun, V.: Point transformer. In: ICCV, pp. 16259–16268 (2021)
Zheng, Z., Wu, D., Lu, R., Lu, F., Chen, G., Jiang, C.: NeuralPCI: spatio-temporal neural field for 3D point cloud multi-frame non-linear interpolation. In: CVPR (2023)
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable feedback. This work is supported in part by the NSFC under Grant No. 62276144.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, T., Qian, G., Xie, J., Yang, J. (2025). FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15132. Springer, Cham. https://doi.org/10.1007/978-3-031-72904-1_15