
FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation

  • Conference paper
  • Published in: Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15132)


Abstract

Point cloud frame interpolation is a challenging task that requires accurate scene flow estimation across frames while preserving geometric structure. Prevailing techniques often rely on pre-trained motion estimators or intensive test-time optimization, resulting in compromised interpolation accuracy or prolonged inference. This work presents FastPCI, which introduces a Pyramid Convolution-Transformer architecture for point cloud frame interpolation. Our hybrid Convolution-Transformer improves local and long-range feature learning, while the pyramid network provides multilevel features and reduces computation. In addition, FastPCI proposes a unique Dual-Direction Motion-Structure block for more accurate scene flow estimation. Our design is motivated by two facts: (1) accurate scene flow preserves 3D structure, and (2) the point cloud at the previous timestep should be reconstructable using reverse motion from the future timestep. Extensive experiments show that FastPCI significantly outperforms the state-of-the-art PointINet and NeuralPCI with notable gains (e.g. 26.6% and 18.3% reduction in Chamfer Distance on KITTI), while being more than 10× and 600× faster, respectively. Code is available at https://github.com/genuszty/FastPCI.
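The dual-direction idea from the abstract can be sketched numerically: a forward scene flow warps the earlier point cloud toward the later one, and reversing that same flow should approximately recover the earlier cloud. Below is a minimal NumPy sketch of this consistency check together with the Chamfer Distance metric the abstract reports. The function names and the one-to-one point-correspondence assumption in the backward term are our illustration, not the paper's actual block, which operates inside a learned Convolution-Transformer pyramid.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3)."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def dual_direction_errors(p0, p1, flow_fwd):
    """Evaluate a forward flow estimate in both temporal directions.

    p0, p1: point clouds at consecutive timesteps, each (N, 3).
    flow_fwd: estimated per-point motion carrying p0 toward p1, (N, 3).

    Forward: warping p0 by the flow should land on p1.
    Backward: applying the reverse motion to p1 should recover p0,
    mirroring the constraint that the previous timestep be
    reconstructable from the future one. For simplicity this sketch
    assumes index correspondence between p0 and p1 when reversing.
    """
    fwd_err = chamfer_distance(p0 + flow_fwd, p1)
    bwd_err = chamfer_distance(p1 - flow_fwd, p0)
    return fwd_err, bwd_err

# Example: under a rigid translation with the true flow, both errors are near zero.
rng = np.random.default_rng(0)
p0 = rng.standard_normal((128, 3))
t = np.array([0.5, -0.2, 0.1])
fwd_err, bwd_err = dual_direction_errors(p0, p0 + t, np.tile(t, (128, 1)))
```

A flow that matches the forward direction but violates the backward reconstruction (e.g. one that collapses points onto p1 without preserving p0's structure) would show a low forward error but a high backward error, which is the failure mode a dual-direction constraint is designed to penalize.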



References

  1. Bao, W., Lai, W.S., Ma, C., Zhang, X., Gao, Z., Yang, M.H.: Depth-aware video frame interpolation. In: CVPR (2019)

  2. Caesar, H., et al.: nuScenes: a multimodal dataset for autonomous driving. In: CVPR, pp. 11621–11631 (2020)

  3. Chang, M.F., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: CVPR, pp. 8748–8757 (2019)

  4. Garrido, D., Rodrigues, R., Augusto Sousa, A., Jacob, J., Castro Silva, D.: Point cloud interaction and manipulation in virtual reality. In: AIVR, pp. 15–20 (2021)

  5. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? The KITTI vision benchmark suite. In: CVPR, pp. 3354–3361 (2012)

  6. Huang, Z., Zhang, T., Heng, W., Shi, B., Zhou, S.: Real-time intermediate flow estimation for video frame interpolation. In: ECCV 2022. LNCS, vol. 13674. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19781-9_36

  7. Kalluri, T., Pathak, D., Chandraker, M., Tran, D.: FLAVR: flow-agnostic video representations for fast frame interpolation. In: WACV, pp. 2070–2081 (2023)

  8. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  9. Kong, L., et al.: IFRNet: intermediate feature refine network for efficient frame interpolation. In: CVPR (2022)

  10. Li, G., et al.: DeepGCNs: making GCNs go as deep as CNNs. IEEE TPAMI (2021)

  11. Li, X., Kaesemodel Pontes, J., Lucey, S.: Neural scene flow prior. In: NeurIPS (2021)

  12. Liu, H., Liao, K., Lin, C., Zhao, Y., Guo, Y.: Pseudo-LiDAR point cloud interpolation based on 3D motion representation and spatial supervision. IEEE Trans. Intell. Transp. Syst. 23(7), 6379–6389 (2021)

  13. Liu, H., Liao, K., Lin, C., Zhao, Y., Liu, M.: PLIN: a network for pseudo-LiDAR point cloud interpolation. Sensors 20(6), 1573 (2020)

  14. Lu, F., Chen, G., Qu, S., Li, Z., Liu, Y., Knoll, A.: PointINet: point cloud frame interpolation network. In: AAAI, pp. 2251–2259 (2021)

  15. Lu, L., Wu, R., Lin, H., Lu, J., Jia, J.: Video frame interpolation with transformer. In: CVPR (2022)

  16. Luo, W., Yang, B., Urtasun, R.: Fast and furious: real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In: CVPR, pp. 3569–3577 (2018)

  17. Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: NeRF: representing scenes as neural radiance fields for view synthesis. In: ECCV 2020. LNCS, vol. 12346, pp. 405–421. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_24

  18. Niklaus, S., Liu, F.: Context-aware synthesis for video frame interpolation. In: CVPR (2018)

  19. Park, J., Ko, K., Lee, C., Kim, C.-S.: BMBC: bilateral motion estimation with bilateral cost volume for video interpolation. In: ECCV 2020. LNCS, vol. 12359, pp. 109–125. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6_7

  20. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)

  21. Pumarola, A., Corona, E., Pons-Moll, G., Moreno-Noguer, F.: D-NeRF: neural radiance fields for dynamic scenes. In: CVPR, pp. 10318–10327 (2021)

  22. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: CVPR (2017)

  23. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: NeurIPS (2017)

  24. Qian, G., Hamdi, A., Zhang, X., Ghanem, B.: Pix4Point: image pretrained standard transformers for 3D point cloud understanding. In: 3DV, pp. 1280–1290 (2024)

  25. Qian, G., Hammoud, H., Li, G., Thabet, A., Ghanem, B.: ASSANet: an anisotropic separable set abstraction for efficient point cloud representation learning. In: NeurIPS (2021)

  26. Qian, G., et al.: PointNeXt: revisiting PointNet++ with improved training and scaling strategies. In: NeurIPS (2022)

  27. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI, pp. 234–241 (2015)

  28. Sim, H., Oh, J., Kim, M.: XVFI: extreme video frame interpolation. In: ICCV (2021)

  29. Thomas, H., Qi, C.R., Deschaud, J.E., Marcotegui, B., Goulette, F., Guibas, L.J.: KPConv: flexible and deformable convolution for point clouds. In: ICCV (2019)

  30. Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. (2019)

  31. Wu, W., Qi, Z., Fuxin, L.: PointConv: deep convolutional networks on 3D point clouds. In: CVPR (2019)

  32. Wu, W., Wang, Z.Y., Li, Z., Liu, W., Fuxin, L.: PointPWC-Net: cost volume on point clouds for (self-)supervised scene flow estimation. In: ECCV 2020. LNCS, vol. 12350, pp. 88–107. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_6

  33. Wu, X., Lao, Y., Jiang, L., Liu, X., Zhao, H.: Point Transformer V2: grouped vector attention and partition-based pooling. In: NeurIPS, pp. 33330–33342 (2022)

  34. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network. CoRR (2015)

  35. Yin, T., Zhou, X., Krähenbühl, P.: Center-based 3D object detection and tracking. In: CVPR (2021)

  36. Zeng, Y., Qian, Y., Zhang, Q., Hou, J., Yuan, Y., He, Y.: IDEA-Net: dynamic 3D point cloud interpolation via deep embedding alignment. In: CVPR (2022)

  37. Zhang, G., Zhu, Y., Wang, H., Chen, Y., Wu, G., Wang, L.: Extracting motion and appearance via inter-frame attention for efficient video frame interpolation. In: CVPR, pp. 5682–5692 (2023)

  38. Zhang, Z., Hu, L., Deng, X., Xia, S.: Sequential 3D human pose estimation using adaptive point cloud sampling strategy. In: IJCAI, pp. 1330–1337 (2021)

  39. Zhao, H., Jiang, L., Jia, J., Torr, P.H., Koltun, V.: Point Transformer. In: ICCV, pp. 16259–16268 (2021)

  40. Zheng, Z., Wu, D., Lu, R., Lu, F., Chen, G., Jiang, C.: NeuralPCI: spatio-temporal neural field for 3D point cloud multi-frame non-linear interpolation. In: CVPR (2023)


Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable feedback. This work is supported in part by the NSFC under Grant No. 62276144.

Author information


Corresponding author

Correspondence to Jin Xie.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 3493 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, T., Qian, G., Xie, J., Yang, J. (2025). FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15132. Springer, Cham. https://doi.org/10.1007/978-3-031-72904-1_15


  • DOI: https://doi.org/10.1007/978-3-031-72904-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72903-4

  • Online ISBN: 978-3-031-72904-1

  • eBook Packages: Computer Science (R0)
