
ColorMNet: A Memory-Based Deep Spatial-Temporal Feature Propagation Network for Video Colorization

  • Conference paper
  • First Online:
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Effectively exploring spatial-temporal features is important for video colorization. Stacking multiple frames along the temporal dimension or recurrently propagating estimated features either accumulates errors or cannot exploit information from far-apart frames. We instead develop a memory-based feature propagation module that establishes reliable connections with features from far-apart frames and alleviates the influence of inaccurately estimated features. To extract better features from each frame for this propagation, we use features from large pretrained visual models to guide per-frame feature estimation so that the estimated features can model complex scenarios. In addition, noting that adjacent frames usually contain similar content, we develop a local attention module that aggregates features from adjacent frames within a spatial-temporal neighborhood. We formulate the memory-based feature propagation module, the large-pretrained-visual-model-guided feature estimation module, and the local attention module into an end-to-end trainable network (named ColorMNet) and show that it performs favorably against state-of-the-art methods on both benchmark datasets and real-world scenarios. Our source code and pre-trained models are available at: https://github.com/yyang181/colormnet.
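The paper's actual modules are defined in the full text; as a rough, simplified illustration of the memory read-out idea behind memory-based feature propagation, the sketch below performs a standard key-value attention of a current-frame query over features stored from far-apart past frames. All names and shapes here are hypothetical, not the paper's implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def memory_readout(query, mem_keys, mem_values):
    """Attend one query vector over a memory bank of (key, value)
    features kept from past frames, so the current frame can borrow
    features from far-apart frames rather than only from its immediate
    predecessor. Returns the attention-weighted sum of the values."""
    dim = len(query)
    # Scaled dot-product scores against every stored key.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
              for key in mem_keys]
    weights = softmax(scores)
    vdim = len(mem_values[0])
    return [sum(w * v[d] for w, v in zip(weights, mem_values))
            for d in range(vdim)]

# Toy memory holding features from two past frames.
mem_keys = [[1.0, 0.0], [0.0, 1.0]]
mem_values = [[10.0, 0.0], [0.0, 10.0]]
# A query resembling the first stored key reads out mostly its value.
out = memory_readout([1.0, 0.0], mem_keys, mem_values)
```

In the paper's setting, low-confidence or stale entries can also be pruned from the memory bank, which is what limits error accumulation compared with purely recurrent propagation.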



Acknowledgements

This work has been partly supported by the National Natural Science Foundation of China (Nos. U22B2049, 62272233, 62332010), the Fundamental Research Funds for the Central Universities (No. 30922010910), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX24_0680).

Author information


Corresponding author

Correspondence to Jinshan Pan.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 65022 KB)

Rights and permissions


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yang, Y., Dong, J., Tang, J., Pan, J. (2025). ColorMNet: A Memory-Based Deep Spatial-Temporal Feature Propagation Network for Video Colorization. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15062. Springer, Cham. https://doi.org/10.1007/978-3-031-73235-5_19


  • DOI: https://doi.org/10.1007/978-3-031-73235-5_19


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73234-8

  • Online ISBN: 978-3-031-73235-5

  • eBook Packages: Computer Science, Computer Science (R0)
