A data mining approach for heavy rainfall forecasting based on satellite image sequence analysis

https://doi.org/10.1016/j.cageo.2006.05.010

Abstract

Investigating the evolvement process of Mesoscale Convective Systems (MCSs) over the Tibetan Plateau using satellite remote sensing image sequences is an important and effective method of forecasting heavy rainfall. This paper presents a spatial data mining approach by which a possible heavy rainfall forecast can be made, based on MCS tracking in remote sensing satellite images. First, an automatic method for object tracking in the satellite image sequence is proposed, aimed at identifying the qualified MCSs, their characteristics and their moving trajectories. Then, a novel two-phase spatial data mining framework is designed to deduce the correlations and causalities between MCS activities and possible heavy rainfall occurrences. The proposed approach proves capable of relieving meteorologists of much of the burden of manual rainfall forecasting by automatically analyzing and interpreting massive meteorological remote sensing data sets to assist weather forecasting.

Introduction

Meteorological satellite data have been used operationally in weather services for more than 30 years. Throughout this period, forecasting severe weather from satellite remote sensing data has remained a challenging task. Early warnings of severe weather, made possible by timely and accurate forecasting, help prevent the casualties and damage caused by natural disasters. This is particularly urgent in China's Yangtze River Basin, where flooding has been so frequent that flood control has become an increasingly grave concern. For example, the unprecedented, severe flood in the Yangtze River Basin in 1998 resulted in the deaths of 4150 people and property damage of approximately 32 billion US dollars. Since almost all floods are caused by intensive heavy rainfalls, the responsible authorities have a clear mandate to provide accurate, advance forecasts of possible heavy rainfall.

Meanwhile, the study of the life cycles, moving trajectories and evolvement trends of Mesoscale Convective Systems (MCSs) remains a challenging and important issue facing the meteorological community, because these phenomena often cause severe weather such as heavy rainfalls, thunderstorms and hurricanes (Houze et al., 1990). In China, the MCSs over the Tibetan Plateau were recently identified as a major factor behind the heavy rainfalls in the Yangtze River Basin, which directly caused severe floods in South China, such as those in 1991, 1994, 1998 and 1999 (Jiang and Fan, 2002).

In order to improve on current severe flood control measures in China, it is necessary to accurately track and characterize the active MCSs over the Tibetan Plateau, using the satellite remote sensing images. A meteorological analysis of all MCSs can then be performed, by taking into account their environmental, physical variables, such as temperature, wind divergence and water vapor flux divergence. As a result, the correlations and causalities between the MCS evolvement process and heavy rainfall occurrences can be deduced from the historical remote sensing scenarios, and be represented as known knowledge to assist prediction of potential occurrences of heavy precipitation.

Unfortunately, meteorologists continue to track, characterize and analyze MCSs manually. In this so-called “expert-eye-scanning” technique, meteorologists carry out extensive manual work to discover the moving trajectories and evolvement trends of MCSs from the satellite remote sensing images, using their professional experience and knowledge (Arnaud et al., 1992). However, the volumes of satellite image data can be huge, making this method inadequate for tracking MCSs over wide areas and long time periods. The method is time-consuming and ineffective, and its results often vary among the experts involved, which affects the reliability and practicability of heavy rainfall forecasting.

To address the above problems, this paper aims to provide meteorologists with an automatic spatial data mining method, based on MCS tracking and analysis in the satellite image sequence, with which possible heavy rainfalls can be predicted so that effective flood control measures can be taken. The basic principle behind the method derives from a recent observation that the eastward movement and propagation of MCSs over the Tibetan Plateau is the crucial factor leading to heavy rainfalls in the Yangtze River Basin (Jiang and Fan, 2002). The method seeks to model and uncover the latent patterns of MCS activities and their evolvement trends over the Tibetan Plateau, using data mining and knowledge discovery techniques. First, using the image sequences of Temperature of Black Body (TBB) acquired from the Geostationary Meteorological Satellite (GMS-5), the qualified MCSs are automatically identified and tracked by image processing and computer vision techniques. An automatic object-tracking approach for investigating moving MCS trajectories is developed for this purpose. Then, the High-resolution Limited Area Analysis and Forecasting System (HLAFS) data around the geographical location of each MCS are used, and a novel two-phase spatial data mining process adopting the C4.5 decision tree algorithm (Quinlan, 1993) is integrated to discover the hidden knowledge that helps reveal the correlations and causalities between the moving trajectories of MCSs and the observed environmental, physical variables. The discovered knowledge is represented in two forms, i.e., derivation rules and environmental, physical model graphs, which can reveal the evolvement trends of the MCSs that cause heavy rainfall occurrences in the Yangtze River Basin.
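As a rough illustration of the split criterion that C4.5 is known for (the gain ratio), the following minimal sketch ranks two candidate attributes on a toy data set. The attribute names (`divergence`, `temperature`), their discretized values, the `moves_east` label and the four sample rows are all hypothetical and are not taken from the paper's data; the sketch shows only the general criterion, not the two-phase mining process itself.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`.

    gain ratio = information gain / split information, the criterion
    C4.5 uses to choose the attribute tested at a tree node.
    """
    labels = [row[target] for row in rows]
    # Group the class labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical, hand-made samples: one row per tracked MCS.
samples = [
    {"divergence": "high", "temperature": "warm", "moves_east": "yes"},
    {"divergence": "high", "temperature": "cold", "moves_east": "yes"},
    {"divergence": "low", "temperature": "warm", "moves_east": "no"},
    {"divergence": "low", "temperature": "cold", "moves_east": "no"},
]

# Pick the attribute with the highest gain ratio as the root test.
best = max(["divergence", "temperature"],
           key=lambda a: gain_ratio(samples, a, "moves_east"))
print(best)
```

In this toy set `divergence` separates the two classes perfectly while `temperature` carries no information, so the former would be chosen as the first test. A full decision tree learner would apply the same criterion recursively and would also handle continuous attributes and pruning, which this sketch omits.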

The rest of this paper is organized as follows. Section 2 introduces the satellite data sources used in the study. Section 3 presents the framework of the proposed meteorological data mining approach. MCS tracking techniques and MCS data mining techniques are detailed in Sections 4 and 5, respectively. The experimental results are illustrated in Section 6. Finally, concluding remarks are provided in Section 7.

Data sources

To track MCSs and discover information and knowledge crucial to heavy rainfall forecasting, the collection of large amounts of satellite data with high spatial and temporal resolutions is indispensable. For this purpose, satellite remote sensing images of the TBB data, taken by the GMS-5 satellite, and data of the HLAFS were provided by the National Satellite Meteorological Center, China Meteorological Administration, for use in this study. The data cover the time period from June to August

Framework

The framework and the processing flow of our meteorological data mining approach are depicted in Fig. 2.

The framework of the proposed approach consists of two stages: MCS tracking and MCS data mining. First, in Stage I, the objects (i.e., MCS structures) are identified, together with their state, such as splitting, merging, vanishing or newly emerging. The next step is to identify the same MCS and track its trajectory using the satellite image sequence over

Related work

The meteorology community has already established a number of numerical cloud analysis and forecasting systems. Much cloud analysis work has been carried out using empirical models and different types of satellite imaging or satellite-observed data. For example, Souto et al. (2003) proposed a cloud analysis method for rainfall forecasting in the Galician region of Spain, applying a high-resolution non-hydrostatic numerical model to the satellite observations. Plonski et al. (2000)

MCS data mining

Identifying and tracking the MCSs are just the first steps in revealing the meteorological correlations and causalities between MCS evolvement trends and heavy rainfall occurrences. Many other research issues, such as trajectory prediction and causality analysis, are difficult to solve by numerical means. Exploration of data mining techniques is a good means for discovering the meteorological knowledge and phenomena hidden behind the massive data collections. To address this issue, a data

Experimental results

We have carried out experiments to evaluate both the MCS tracking method and the data mining approach proposed in the paper. First, we compared our MCS tracking approach with the area-overlapping tracking method proposed by Arnaud et al. (1992). The manual “expert-eye-scanning” method is also adopted as a performance benchmark for the experimental results. Table 1 illustrates the experimental results and comparisons of the above methods.
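To make the area-overlapping baseline more concrete, the sketch below shows the general idea of overlap-based tracking between two consecutive frames: cold-cloud regions are extracted by thresholding, and each region is linked to the region it overlaps most in the next frame. This is only an illustration of the general idea and not the procedure of Arnaud et al. (1992) or the tracking method proposed in this paper. The temperature threshold of -32, the 4-connectivity, the `min_fraction` overlap criterion and the toy frames are all assumptions made for the example.

```python
def label_regions(tbb, threshold):
    """Return one set of (row, col) pixels per 4-connected cold region.

    A pixel belongs to a region when its value is <= threshold.
    """
    n_rows, n_cols = len(tbb), len(tbb[0])
    seen = set()
    regions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if tbb[r][c] > threshold or (r, c) in seen:
                continue
            # Flood fill from this seed pixel.
            stack, region = [(r, c)], set()
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and (ny, nx) not in seen
                            and tbb[ny][nx] <= threshold):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(region)
    return regions


def match_by_overlap(prev_regions, next_regions, min_fraction=0.5):
    """Link each previous-frame region to its best-overlapping successor.

    A link is accepted only if the shared area is at least `min_fraction`
    of the smaller region (an assumed criterion); otherwise the region is
    mapped to None, i.e. treated as having no successor.
    """
    links = {}
    for i, prev in enumerate(prev_regions):
        best_j, best_overlap = None, 0
        for j, nxt in enumerate(next_regions):
            overlap = len(prev & nxt)
            if overlap > best_overlap:
                best_j, best_overlap = j, overlap
        if (best_j is not None and best_overlap
                >= min_fraction * min(len(prev), len(next_regions[best_j]))):
            links[i] = best_j
        else:
            links[i] = None
    return links


# Toy frames with hypothetical brightness temperatures.
frame_a = [
    [-10, -10, -10, -10, -10],
    [-40, -40, -10, -10, -10],
    [-40, -40, -10, -10, -10],
    [-10, -10, -10, -10, -50],
]
frame_b = [
    [-10, -10, -10, -10, -10],
    [-10, -40, -40, -10, -10],
    [-10, -40, -40, -10, -10],
    [-10, -10, -10, -10, -10],
]

regions_a = label_regions(frame_a, threshold=-32)
regions_b = label_regions(frame_b, threshold=-32)
links = match_by_overlap(regions_a, regions_b)
print(links)
```

Here the four-pixel region shifts one column to the east and is linked to its successor, while the single cold pixel in the corner has no overlapping successor. Pure overlap matching of this kind depends on systems moving less than their own extent between frames, which is one reason more elaborate tracking criteria are of interest.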

As shown in Table 1, MCS no. measures the number of

Conclusions

The research reported here proposes an efficient tracking, characterization and analysis tool for MCS and a novel, two-phase spatial data mining framework, based on meteorological satellite remote sensing image sequence analysis. The tool can be used to discover the knowledge used to reveal the correlations and causalities between MCS evolvement trends and possible heavy rainfall occurrences in the Yangtze River Basin of China. Experimental results show that the proposed approach simplifies and

Acknowledgments

The authors thank their collaborators in the National Satellite Meteorological Center of China Meteorological Administration and at the Hong Kong Observatory, for their stimulating discussion on domain-specific knowledge and provision of data sources. The authors are also grateful to the editors and reviewers for their valuable comments and suggestions on this manuscript.

References (13)

  • Y. Arnaud et al. (1992). Automatic tracking and characterization of African convective systems on Meteosat pictures. Journal of Applied Meteorology.
  • L. Breiman et al. (1984). Classification and Regression Trees.
  • H. Freeman (1974). Computer processing of line-drawing images. Computing Surveys.
  • L.B. Holder (1995). Intermediate decision trees. In: Proceedings of the 14th International Joint Conference on...
  • R. Houze et al. (1990). Mesoscale organization of springtime rainstorms in Oklahoma. Monthly Weather Review.
  • J. Jiang et al. (2002). Convective clouds and mesoscale convective systems over the Tibetan Plateau in summer. Atmosphere Science.


This work is supported by the National Natural Science Foundation of PR China under Grant No. 60505008 and the RGC grant from Hong Kong Research Grant Council under Grant No. CUHK4132/99H.
