Identifying anomalous signals in GPS data using HMMs: An increased likelihood of earthquakes?

doi:10.1016/j.csda.2011.09.019

Computational Statistics & Data Analysis

Volume 58, February 2013, Pages 27-44

https://doi.org/10.1016/j.csda.2011.09.019

Abstract

A way of combining a hidden Markov model (HMM) and mutual information analysis is proposed to detect possible precursory signals for earthquakes from Global Positioning System (GPS) data. A non-linear filter, which measures the short-term deformation rate ranges, is introduced to extract anomalous signals from the GPS measurements of ground deformation. An HMM fitted to the filtered GPS measurements can classify the deformation data into different states which form proxies for elements of the earthquake cycle. Mutual information is then used to examine whether any of these states possesses any precursory characteristics. The class of GPS measurements identified by the HMM as having the largest variation of deformation rate shows some precursory information and is hence considered as a “precursory state”. The performance of possible earthquake forecasts is assessed by comparing a decision rule (based on model characteristics) with the actual outcome.

Introduction

There has been considerable interest in whether GPS measurements have any predictive power for earthquake occurrences. Roeloffs (2006) listed at least ten credible examples of tectonic earthquakes preceded by deformation rate changes, and GPS observations around the future rupture source detected slow slip leading up to the October 2004 earthquake of magnitude 6.6 in the Chuetsu area, central Japan (Ogata, 2007). These anomalies manifested as long-term pre-earthquake slip, but it is possible that precursory GPS signals may operate on a shorter time-scale, and therefore techniques which can detect or extract subtle changes (such as irregular spikes, step jumps and trend changes) in GPS measurements, which may be related to earthquakes, are necessary.

Granat and Donnellan (2002) and Granat (2003) used an HMM-based method to analyze GPS data from the southern California region. From the daily displacement time series collected in Claremont, California, they observed a clear separation of the states before and after the Hector Mine earthquake in October 1999. Granat (2006) applied this method to the daily GPS data from 127 stations from the Southern California Integrated Geodetic Network. Approximately 70 of the stations had state changes on the day of the Hector Mine earthquake, indicating that GPS signals from earthquakes are detectable over hundreds of km. However, the different states were clearly dominated by the long-term trends of each component of the data, and each state was entered and exited only once (see e.g. Figure 5 in Granat (2006)); thus the method is not suitable for predictive purposes.

We therefore introduce a non-linear filter for the GPS process to extract distinguishable signals from the majority of the data. This filter calculates the range of the short-term deformation rates and thereby captures anomalies.
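As a rough illustration of what a filter of this kind might look like, the sketch below computes short-term rates as differences over a few samples and then takes the range (maximum minus minimum) of those rates over a trailing window. The exact definition used in the paper is not given in this excerpt, so the function name `rate_range_filter` and the window lengths `rate_window` and `range_window` are assumptions made purely for illustration.

```python
import numpy as np

def rate_range_filter(displacement, rate_window=3, range_window=10):
    """Range of short-term deformation rates over a trailing window.

    Hypothetical sketch: rate_window and range_window (in samples) are
    illustrative choices, not values taken from the paper.
    """
    x = np.asarray(displacement, dtype=float)
    # Short-term rate: average change per sample over `rate_window` samples.
    rates = (x[rate_window:] - x[:-rate_window]) / rate_window
    # The first range_window - 1 outputs have no full window and stay NaN.
    out = np.full(rates.shape, np.nan)
    for t in range(range_window - 1, len(rates)):
        window = rates[t - range_window + 1 : t + 1]
        out[t] = window.max() - window.min()
    return out
```

Under this definition a steady linear trend produces an output of zero, while a step jump or spike in the displacement series produces a burst of large values, which is the sort of behaviour a filter aimed at anomalies rather than long-term trends would need.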

Studies concerning earthquake genesis suggest an acceleration in small scale seismicity before a large event (Bowman et al., 1998, Jaumé and Sykes, 1999, Vere-Jones et al., 2001) which may have a cyclic nature (Jaumé and Bebbington, 2004, Bebbington et al., 2010). GPS measurements of deformation may therefore capture this acceleration, indirectly reflecting the underlying dynamics (the unobservable or hidden states) of the earthquake system (cf. Wang et al., in press). Under this assumption, our nonlinear filter and the underlying dynamics form an HMM framework. The HMM categorizes the data into different states, each state suggesting particular dynamics, one or more of which may have a precursory character. We thus use the Viterbi algorithm (Viterbi, 1967, Forney, 1973) to track the most probable sequence of states from the GPS data, and calculate the mutual information (MI) between each state from the most likely state sequence and the earthquake occurrences to examine if there is any association. We also discuss a possible way of producing earthquake forecasts based on the mutual information results. We illustrate the method on data from the central North Island, New Zealand and data from Southern California.
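For readers unfamiliar with the Viterbi algorithm, a minimal log-space sketch is given below. It assumes the HMM has already been fitted, so that the initial, transition and per-observation emission log-probabilities are available as arrays; the function and argument names are illustrative and do not come from the paper.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emis):
    """Most probable hidden state sequence of an HMM (log-space Viterbi).

    log_init:  (K,)   log initial state probabilities
    log_trans: (K, K) log transition probabilities, rows index the from-state
    log_emis:  (T, K) log-likelihood of each observation under each state
    """
    T, K = log_emis.shape
    delta = np.empty((T, K))            # best log-score of a path ending in each state
    back = np.zeros((T, K), dtype=int)  # best predecessor for each state
    delta[0] = log_init + log_emis[0]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to j.
        scores = delta[t - 1][:, None] + log_trans
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_emis[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path
```

The returned integer sequence assigns one state to each observation; an indicator series for any single state (for example `path == k`) can then be compared with the earthquake occurrence series.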

Section snippets

Mutual information analysis

The association between signals (‘states’ in the HMM formulation) and the earthquake occurrences will be measured using mutual information, which quantifies the amount of information that one random variable contains about another (Cover and Thomas, 1991). It can provide evidence of any significant association between two series of events.

The mutual information of a bivariate random variable $(U, V)$ is defined as $I_{UV} = E\left\{\log_{2}\left(\frac{p_{UV}(u, v)}{p_{U}(u)\, p_{V}(v)}\right)\right\},$ where $p_{UV}(u, v)$ is the joint probability mass
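To make the definition concrete, the following sketch computes a simple plug-in estimate of the mutual information (in bits) between two discrete series, such as a state indicator and an earthquake indicator, by replacing the probabilities with observed relative frequencies. This is only one possible estimator and is not necessarily the one used in the paper; plug-in estimates are known to be biased for short series. The function name is hypothetical.

```python
import numpy as np

def mutual_information(u, v):
    """Plug-in estimate of mutual information (bits) between two discrete series."""
    u = np.asarray(u).ravel()
    v = np.asarray(v).ravel()
    # Map each series to integer category codes.
    u_vals, u_idx = np.unique(u, return_inverse=True)
    v_vals, v_idx = np.unique(v, return_inverse=True)
    # Joint relative frequencies from a contingency table.
    joint = np.zeros((len(u_vals), len(v_vals)))
    np.add.at(joint, (u_idx, v_idx), 1.0)
    joint /= joint.sum()
    # Marginals and their outer product p_U(u) * p_V(v).
    pu = joint.sum(axis=1, keepdims=True)
    pv = joint.sum(axis=0, keepdims=True)
    indep = pu @ pv
    nz = joint > 0  # cells with zero mass contribute nothing to the sum
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))
```

Two identical balanced binary series give 1 bit, and two series whose empirical joint distribution factorizes exactly give 0 bits.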

Case study—data from the central North Island, New Zealand

As a case study, we consider data from the central North Island, New Zealand, located near the boundary of the Australian tectonic plate. According to DeMets et al. (1994), the Pacific and Australian tectonic plates are converging obliquely at about 42 mm/yr, accommodated by subduction of the Pacific plate and deformation of the overlying Australian plate (see e.g. Figure 1 in Reyners et al. (2006)). This area also contains the Taupo Volcanic Zone (TVZ), an active continental rift in the

Case study—data from Southern California

We will now consider another data set, with longer sequences of observations, from a different tectonic environment (strike-slip, rather than subduction-related rifting) in Southern California. The southern part of the San Andreas fault (as shown in Fig. 10), which forms the tectonic boundary between the Pacific Plate (on the west) and the North American Plate (on the east), runs through Southern California. The motion of the San Andreas fault is right-lateral strike-slip. The Pacific Plate moves

Discussion

First, we must emphasize that what we have presented here is an exploratory paper, developing preliminary techniques for analyzing potential precursory GPS signals. We have been greatly limited by the available data. There must be a lengthy GPS series, with constant variance, and moreover, a reasonable number of earthquakes need to have occurred in proximity to the GPS stations. Over the next decade or two, much more data will accumulate, in both time and space, allowing for further model

Acknowledgments

This work was supported by the Marsden Fund, administered by the Royal Society of New Zealand. We would like to thank John Beavan, Marco Brenna, the anonymous reviewers and the editor for providing helpful suggestions on an earlier draft which have improved the paper and the geophysical interpretation of our results.

References (45)

D.R. Brillinger et al., Mutual information in the frequency domain, Journal of Statistical Planning and Inference (2007)
J. Bulla et al., Stylized facts of financial time series and hidden semi-Markov models, Computational Statistics & Data Analysis (2006)
J.W. Cole et al., Lithic types in ignimbrites as a guide to the evolution of a caldera complex, Taupo volcanic centre, New Zealand, Journal of Volcanology and Geothermal Research (1998)
V.I. Keilis-Borok et al., Premonitory activation of earthquake flow: algorithm M8, Physics of the Earth and Planetary Interiors (1990)
R. Langrock et al., Hidden Markov models with arbitrary state dwell-time distributions, Computational Statistics & Data Analysis (2011)
A. Peltier et al., Structures involved in the vertical deformation at Lake Taupo (New Zealand) between 1979 and 2007: new insights from numerical modelling, Journal of Volcanology and Geothermal Research (2009)
M. Reyners, Stress and strain from earthquakes at the southern termination of the Taupo Volcanic Zone, New Zealand, Journal of Volcanology and Geothermal Research (2010)
H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control (1974)
A. Antos et al., Convergence properties of functional estimates for discrete distributions, Random Structures and Algorithms (2001)
M.S. Bebbington, Identifying volcanic regimes using hidden Markov models, Geophysical Journal International (2007)
M.S. Bebbington et al., Repeated intermittent earthquake cycles in the San Francisco bay region, Pure and Applied Geophysics (2010)
D.D. Bowman et al., An observational test of the critical earthquake concept, Journal of Geophysical Research (1998)
D.R. Brillinger, Some data analyses using mutual information, Brazilian Journal of Probability and Statistics (2004)
J. Bulla et al., Computational issues in parameter estimation for stationary hidden Markov models, Computational Statistics & Data Analysis (2008)
G. Celeux et al., Selecting hidden Markov model state number with cross-validated likelihood, Computational Statistics & Data Analysis (2008)
R. Christensen, Log-Linear Models for Logistic Regression (1997)
D. Collett, Modelling Binary Data (1991)
T.M. Cover et al., Elements of Information Theory (1991)
C. DeMets et al., Effect of recent revisions to the geomagnetic reversal time scale on estimates of current plate motions, Geophysical Research Letters (1994)
C. DeMets et al., A revised estimate of Pacific–North America motion and implications for western North America plate boundary zone tectonics, Geophysical Research Letters (1987)
G. Di Grazia et al., A multiparameter approach to volcano monitoring based on 4D analyses of seismo-volcanic and acoustic signals: the 2008 Mt. Etna eruption, Geophysical Research Letters (2009)
G.D. Forney, The Viterbi algorithm, Proceedings of the IEEE (1973)

