MMSISP: A Satellite Image Sequence Prediction Network with Multi-factor Decoupling and Multi-modal Fusion

Mo, Fanbin; Huang, Yixiang; Wu, Ming; Zhu, Xun; Zhang, Chuang

doi:10.1007/978-3-031-78312-8_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15322)

Included in the following conference series:

International Conference on Pattern Recognition

Abstract

Satellite image sequence prediction is a branch of spatio-temporal prediction, which holds considerable potential for practical applications. However, the complex and diverse changes of satellite images over time hinder existing spatio-temporal prediction models from achieving high-accuracy long-term predictions. In this paper, we propose a method called MMSISP (Multi-Factor Multi-Modal Satellite Image Sequence Predictor). This method decomposes satellite image changes into multiple factors and models them using two branches. The motion branch is utilized for predicting cloud movement, while the appearance branch is employed for forecasting cloud variations (e.g., formation and dissipation), as well as brightness change. Additionally, we introduce two modalities: capture time and meteorological data, enabling the model to have more clues for predicting future frames. For the capture time, we design a time embedding module that enables the model to infer brightness and learn seasonal patterns of cloud formation and dissipation. Regarding meteorological data, which contains information about cloud movement and cloud variations, we devise different spatio-temporal multi-modal fusion mechanisms for the two branches. Based on experiments conducted on the Himawari-8 satellite images, our method demonstrates a significant improvement in accuracy compared to other methods.
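
The abstract does not say how the time embedding module represents capture time. As a minimal sketch of one common approach, and not the authors' design, the capture time can be turned into cyclic sine/cosine features for time of day and day of year, so that 23:50 and 00:00 (or 31 December and 1 January) land close together. A learned layer would then map these features to an embedding. The function name `cyclic_time_features` and the choice of four features are assumptions made for illustration.

```python
import math
from datetime import datetime, timezone
from typing import List


def cyclic_time_features(capture_time: datetime) -> List[float]:
    """Encode a capture time as four cyclic features (hypothetical helper).

    Returns [sin(hour), cos(hour), sin(day), cos(day)], where the hour angle
    spans one day and the day angle spans one year. This is an illustrative
    encoding, not the time embedding module described in the paper.
    """
    # Assumption: timestamps without a timezone are already in UTC.
    if capture_time.tzinfo is None:
        capture_time = capture_time.replace(tzinfo=timezone.utc)
    t = capture_time.astimezone(timezone.utc)

    # Fraction of the day elapsed, in [0, 1).
    day_frac = (t.hour * 3600 + t.minute * 60 + t.second) / 86400.0
    # Approximate fraction of the year elapsed; 365.25 averages over leap years.
    year_frac = (t.timetuple().tm_yday - 1 + day_frac) / 365.25

    hour_angle = 2.0 * math.pi * day_frac
    day_angle = 2.0 * math.pi * year_frac
    return [
        math.sin(hour_angle),
        math.cos(hour_angle),
        math.sin(day_angle),
        math.cos(day_angle),
    ]


if __name__ == "__main__":
    # Example: a frame captured at 06:00 UTC on 1 July 2020.
    print(cyclic_time_features(datetime(2020, 7, 1, 6, 0, tzinfo=timezone.utc)))
```

The time-of-day pair carries the diurnal cycle that brightness changes would follow, and the day-of-year pair carries the seasonal cycle. How the actual model consumes capture time is left to the full paper.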

Author information

Authors and Affiliations

School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
Fanbin Mo, Yixiang Huang, Ming Wu, Xun Zhu & Chuang Zhang

Corresponding author

Correspondence to Chuang Zhang.

Editor information

Editors and Affiliations

University of Salford, Salford, Lancashire, UK
Apostolos Antonacopoulos
Indian Institute of Technology Bombay, Mumbai, Maharashtra, India
Subhasis Chaudhuri
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu
IIT Kharagpur, Kharagpur, West Bengal, India
Saumik Bhattacharya
Indian Statistical Institute Kolkata, Kolkata, West Bengal, India
Umapada Pal

About this paper

Cite this paper

Mo, F., Huang, Y., Wu, M., Zhu, X., Zhang, C. (2025). MMSISP: A Satellite Image Sequence Prediction Network with Multi-factor Decoupling and Multi-modal Fusion. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15322. Springer, Cham. https://doi.org/10.1007/978-3-031-78312-8_15

DOI: https://doi.org/10.1007/978-3-031-78312-8_15
Published: 04 December 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78311-1
Online ISBN: 978-3-031-78312-8
eBook Packages: Computer Science, Computer Science (R0)

Societies and partnerships

The International Association for Pattern Recognition