MMSISP: A Satellite Image Sequence Prediction Network with Multi-factor Decoupling and Multi-modal Fusion

  • Conference paper
Pattern Recognition (ICPR 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15322))

Abstract

Satellite image sequence prediction is a branch of spatio-temporal prediction with considerable potential for practical applications. However, satellite images change over time in complex and diverse ways, which prevents existing spatio-temporal prediction models from achieving accurate long-term predictions. In this paper, we propose MMSISP (Multi-Factor Multi-Modal Satellite Image Sequence Predictor), a method that decomposes satellite image changes into multiple factors and models them with two branches. The motion branch predicts cloud movement, while the appearance branch forecasts cloud variations (e.g., formation and dissipation) and brightness changes. We also introduce two additional modalities, capture time and meteorological data, which give the model more cues for predicting future frames. For capture time, we design a time embedding module that enables the model to infer brightness and learn seasonal patterns of cloud formation and dissipation. For meteorological data, which carries information about both cloud movement and cloud variations, we devise a different spatio-temporal multi-modal fusion mechanism for each of the two branches. In experiments on Himawari-8 satellite images, our method shows a significant improvement in accuracy over other methods.
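
To illustrate why capture time can carry brightness and seasonal cues, the following is a minimal sketch of one common way to encode a timestamp cyclically before feeding it to a network. It is not the paper's time embedding module, whose design is not described in the abstract. The function name `cyclic_time_features` and the choice of sine/cosine pairs for time of day and day of year are assumptions made purely for illustration.

```python
import math
from datetime import datetime
from typing import List


def cyclic_time_features(ts: datetime) -> List[float]:
    """Hypothetical encoding: map a capture time to four cyclic features.

    Returns [sin(day), cos(day), sin(year), cos(year)], where "day" is the
    fraction of the day elapsed and "year" is the fraction of the year elapsed.
    Assumption for illustration only; not the MMSISP time embedding module.
    """
    # Fraction of the day elapsed (relates to diurnal brightness).
    day_fraction = (ts.hour * 3600 + ts.minute * 60 + ts.second) / 86400.0
    # Approximate fraction of the year elapsed (relates to seasonal patterns).
    year_fraction = (ts.timetuple().tm_yday - 1) / 365.25

    features: List[float] = []
    for fraction in (day_fraction, year_fraction):
        angle = 2.0 * math.pi * fraction
        # A sin/cos pair keeps 23:59 close to 00:00 in feature space.
        features.extend([math.sin(angle), math.cos(angle)])
    return features


if __name__ == "__main__":
    print(cyclic_time_features(datetime(2020, 6, 15, 6, 0, 0)))
```

In a learned model, features of this kind would typically pass through a small trainable projection before being combined with image features. Whether MMSISP does so is not stated in the abstract.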

Corresponding author

Correspondence to Chuang Zhang.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mo, F., Huang, Y., Wu, M., Zhu, X., Zhang, C. (2025). MMSISP: A Satellite Image Sequence Prediction Network with Multi-factor Decoupling and Multi-modal Fusion. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15322. Springer, Cham. https://doi.org/10.1007/978-3-031-78312-8_15

  • DOI: https://doi.org/10.1007/978-3-031-78312-8_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78311-1

  • Online ISBN: 978-3-031-78312-8

  • eBook Packages: Computer Science, Computer Science (R0)
