A data mining approach for heavy rainfall forecasting based on satellite image sequence analysis☆
Introduction
Meteorological satellite data have been used operationally in weather services for more than 30 years. Throughout this period, forecasting severe weather from satellite remote sensing data has remained a challenging task. Early warnings of severe weather, made possible by timely and accurate forecasting, help prevent the casualties and damage caused by natural disasters. This is particularly urgent in China's Yangtze River Basin, which floods so often that flood control in China has become an increasingly grave concern. For example, the unprecedented, severe flood in the Yangtze River Basin in 1998 resulted in the deaths of 4150 people and property damage of approximately 32 billion US dollars. Since almost all floods are caused by intensive heavy rainfall, the responsible authorities have a clear mandate to provide accurate, advance forecasts of possible heavy rainfall.
Meanwhile, the study of the life cycles, moving trajectories and evolvement trends of Mesoscale Convective Systems (MCSs) remains an important and challenging issue for the meteorological community, because these systems often cause severe weather such as heavy rainfall, thunderstorms and hurricanes (Houze et al., 1990). In China, MCSs over the Tibetan Plateau were recently identified as a major cause of the heavy rainfall in the Yangtze River Basin that directly led to severe floods in South China, for example in 1991, 1994, 1998 and 1999 (Jiang and Fan, 2002).
To improve current flood control measures in China, it is necessary to accurately track and characterize the active MCSs over the Tibetan Plateau using satellite remote sensing images. A meteorological analysis of all MCSs can then be performed by taking into account their environmental and physical variables, such as temperature, wind divergence and water vapor flux divergence. As a result, the correlations and causalities between the MCS evolvement process and heavy rainfall occurrences can be deduced from historical remote sensing scenarios and represented as knowledge that assists the prediction of potential heavy precipitation.
Unfortunately, meteorologists still track, characterize and analyze MCSs manually. In this so-called “expert-eye-scanning” technique, meteorologists rely on their professional experience and knowledge to work out the moving trajectories and evolvement trends of MCSs from the satellite remote sensing images by hand (Arnaud et al., 1992). However, the volume of satellite image data can be huge, which makes this method inadequate for tracking MCSs over wide areas and long time periods. The method is time consuming and inefficient, and its results often vary from one expert to another, which undermines the reliability and practicability of heavy rainfall forecasting.
To address the above problems, this paper aims to provide meteorologists with an automatic spatial data mining method, based on MCS tracking and analysis in the satellite image sequence, with which possible heavy rainfall can be predicted so that effective flood control measures can be taken. The method builds on a recent observation that the eastward movement and propagation of MCSs over the Tibetan Plateau is the crucial factor leading to heavy rainfall in the Yangtze River Basin (Jiang and Fan, 2002). It seeks to model and uncover the latent patterns of MCS activities and their evolvement trends over the Tibetan Plateau, using data mining and knowledge discovery techniques. Firstly, using image sequences of Temperature of Black Body (TBB) acquired from the Geostationary Meteorological Satellite (GMS-5), the qualified MCSs are automatically identified and tracked by image processing and computer vision techniques. An automatic object-tracking approach is developed for this purpose to follow the trajectories of moving MCSs. Then, the High-resolution Limited Area Analysis and Forecasting System (HLAFS) remotely sensed data around the geographical location of each MCS are used, and a novel two-phase spatial data mining process adopting the C4.5 decision tree algorithm (Quinlan, 1993) is applied to discover the hidden knowledge that reveals the correlations and causalities between the moving trajectories of MCSs and the observed environmental and physical variables. The discovered knowledge is represented in two forms, i.e., derivation rules and environmental, physical model graphs, which can reveal the evolvement trends of the MCSs that cause heavy rainfall in the Yangtze River Basin.
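As a rough illustration of how a C4.5-style decision tree chooses which variable to split on, the sketch below computes the gain ratio, the split criterion used by C4.5, for categorical attributes. The attribute names (`divergence`, `vapor_flux`), the class labels and the four sample rows are hypothetical and hand-made for illustration. They are not data or variable encodings from the study, and the paper's two-phase mining process is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, attribute):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    total = len(rows)
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)

    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    gain = entropy(labels) - remainder

    # Split information penalises attributes that create many small partitions.
    split_info = -sum(
        (len(part) / total) * math.log2(len(part) / total)
        for part in partitions.values()
    )
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical samples (assumption): discretised variables near an MCS and
# whether the MCS later moved east. These are NOT data from the study.
rows = [
    {"divergence": "low", "vapor_flux": "high"},
    {"divergence": "low", "vapor_flux": "low"},
    {"divergence": "high", "vapor_flux": "high"},
    {"divergence": "high", "vapor_flux": "low"},
]
labels = ["east", "east", "stay", "stay"]

# The attribute with the highest gain ratio becomes the next tree node.
best = max(rows[0], key=lambda a: gain_ratio(rows, labels, a))
```

In this toy set, `divergence` separates the two classes perfectly while `vapor_flux` carries no information, so `divergence` would be chosen as the root split. Continuous variables such as temperature would first need thresholds, which this sketch leaves out.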
The rest of this paper is organized as follows. Section 2 introduces the satellite data sources used in the study. Section 3 presents the framework of the proposed meteorological data mining approach. MCS tracking techniques and MCS data mining techniques are detailed in Sections 4 and 5, respectively. The experimental results are presented in Section 6. Finally, concluding remarks are provided in Section 7.
Data sources
To track MCSs and discover the information and knowledge crucial to heavy rainfall forecasting, large amounts of satellite data with high spatial and temporal resolution are indispensable. For this purpose, satellite remote sensing images of the TBB data, taken by the GMS-5 satellite, and HLAFS data were provided by the National Satellite Meteorological Center, China Meteorological Administration, for use in this study. The data cover the time period from June to August
Framework
The framework and the processing flow of our meteorological data mining approach are depicted in Fig. 2.
The framework of the proposed approach consists of two stages: MCS tracking and MCS data mining. Firstly, in Stage I, the presence of objects, i.e., MCS structures, is identified, together with the existence of a certain state, such as splitting, merging, vanishing or the new emergence of an MCS. The next step is to identify the same MCS and track its trajectory using the satellite image sequence over
Related work
The meteorology community has already established a number of numerical cloud analysis and forecasting systems. Much cloud analysis work has been carried out using empirical models and different types of satellite images or satellite-observed data. For example, Souto et al. (2003) proposed a cloud analysis method for rainfall forecasting in the Galician region of Spain, applying a high-resolution non-hydrostatic numerical model to the satellite observations. Plonski et al. (2000)
MCS data mining
Identifying and tracking the MCSs are just the first steps in revealing the meteorological correlations and causalities between MCS evolvement trends and heavy rainfall occurrences. Many other research issues, such as trajectory prediction and causality analysis, are difficult to solve by numerical means. Exploration of data mining techniques is a good means for discovering the meteorological knowledge and phenomena hidden behind the massive data collections. To address this issue, a data
Experimental results
We have carried out experiments to evaluate both the MCS tracking method and the data mining approach proposed in the paper. Firstly, we compared our MCS tracking approach with the area-overlapping tracking method proposed by Arnaud et al. (1992). The manual “expert-eye-scanning” method is also adopted as the performance benchmark for the experimental results. Table 1 presents the experimental results and comparisons of the above methods.
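To make the idea of area-overlap tracking concrete, the following minimal sketch links cloud regions in two consecutive frames when their pixel sets overlap, and then derives vanish, split, new and merge events from those links. It is written in the spirit of overlap-based tracking only. The region representation (sets of pixel coordinates), the `min_fraction` threshold and the helper names are assumptions for illustration, not the rules used by Arnaud et al. (1992) or by the approach proposed in this paper.

```python
def overlap_links(prev_regions, next_regions, min_fraction=0.5):
    """Link regions in consecutive frames whose pixel sets overlap.

    Each region is a set of (row, col) pixels. `min_fraction` is an assumed
    threshold on the overlap relative to the smaller of the two regions.
    """
    links = []
    for i, prev in enumerate(prev_regions):
        for j, nxt in enumerate(next_regions):
            shared = len(prev & nxt)
            if shared and shared / min(len(prev), len(nxt)) >= min_fraction:
                links.append((i, j))
    return links


def classify_events(links, n_prev, n_next):
    """Derive vanish / split / new / merge events from the overlap links."""
    successors = {i: [j for a, j in links if a == i] for i in range(n_prev)}
    predecessors = {j: [i for i, b in links if b == j] for j in range(n_next)}
    return {
        "vanished": [i for i, s in successors.items() if not s],
        "split": [i for i, s in successors.items() if len(s) > 1],
        "new": [j for j, p in predecessors.items() if not p],
        "merged": [j for j, p in predecessors.items() if len(p) > 1],
    }


def block(r0, c0, height, width):
    """Helper: a rectangular pixel set, used only to build toy regions."""
    return {(r, c) for r in range(r0, r0 + height) for c in range(c0, c0 + width)}


# Toy frames (not real TBB data): region 0 drifts one pixel east, region 1
# disappears, and an unrelated region appears in the second frame.
frame_t = [block(0, 0, 4, 4), block(10, 10, 3, 3)]
frame_t1 = [block(0, 1, 4, 4), block(20, 20, 2, 2)]

links = overlap_links(frame_t, frame_t1)
events = classify_events(links, len(frame_t), len(frame_t1))
```

Pure overlap matching depends on a system moving less than its own extent between frames, so small or fast-moving systems can be lost when images are far apart in time.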
As shown in Table 1, MCS no. measures the number of
Conclusions
The research reported here proposes an efficient tracking, characterization and analysis tool for MCS and a novel, two-phase spatial data mining framework, based on meteorological satellite remote sensing image sequence analysis. The tool can be used to discover the knowledge used to reveal the correlations and causalities between MCS evolvement trends and possible heavy rainfall occurrences in the Yangtze River Basin of China. Experimental results show that the proposed approach simplifies and
Acknowledgments
The authors thank their collaborators in the National Satellite Meteorological Center of China Meteorological Administration and at the Hong Kong Observatory, for their stimulating discussion on domain-specific knowledge and provision of data sources. The authors are also grateful to the editors and reviewers for their valuable comments and suggestions on this manuscript.
References (13)
- Arnaud et al., 1992. Automatic tracking and characterization of African convective systems on meteosat pictures. Journal of Applied Meteorology.
- Breiman et al., 1984. Classification and Regression Trees.
- Freeman, 1974. Computer processing of line-drawing images. Computing Surveys.
- Holder, L.B., 1995. Intermediate decision trees. In: Proceedings of the 14th International Joint Conference on...
- Houze et al., 1990. Mesoscale organization of springtime rainstorms in Oklahoma. Monthly Weather Review.
- Jiang and Fan, 2002. Convective clouds and mesoscale convective systems over the Tibetan Plateau in summer. Atmosphere Science.
☆ This work is supported by the National Natural Science Foundation of PR China under Grant No. 60505008 and the RGC grant from Hong Kong Research Grant Council under Grant No. CUHK4132/99H.