Identifying anomalous signals in GPS data using HMMs: An increased likelihood of earthquakes?

https://doi.org/10.1016/j.csda.2011.09.019

Abstract

A method combining a hidden Markov model (HMM) with mutual information analysis is proposed to detect possible precursory signals for earthquakes in Global Positioning System (GPS) data. A nonlinear filter, which measures the range of short-term deformation rates, is introduced to extract anomalous signals from the GPS measurements of ground deformation. An HMM fitted to the filtered GPS measurements classifies the deformation data into different states, which form proxies for elements of the earthquake cycle. Mutual information is then used to examine whether any of these states has precursory characteristics. The class of GPS measurements identified by the HMM as having the largest variation in deformation rate shows some precursory information and is hence considered a “precursory state”. The performance of possible earthquake forecasts is assessed by comparing a decision rule (based on model characteristics) with the actual outcome.

Introduction

There has been considerable interest in whether GPS measurements have any predictive power for earthquake occurrences. Roeloffs (2006) listed at least ten credible examples of tectonic earthquakes preceded by deformation rate changes, and GPS observations around the future rupture source detected slow slip leading up to the October 2004 earthquake of magnitude 6.6 in the Chuetsu area, central Japan (Ogata, 2007). These anomalies manifested as long-term pre-earthquake slip, but precursory GPS signals may also operate on a shorter time-scale. Techniques are therefore needed that can detect or extract subtle changes in GPS measurements (such as irregular spikes, step jumps and trend changes) which may be related to earthquakes.

Granat and Donnellan (2002) and Granat (2003) used an HMM-based method to analyze GPS data from the southern California region. From the daily displacement time series collected in Claremont, California, they observed a clear separation of the states before and after the Hector Mine quake in October 1999. Granat (2006) applied this method to the daily GPS data from 127 stations from the Southern California Integrated Geodetic Network. Approximately 70 of the stations had state changes on the day of the Hector Mine earthquake, indicating that GPS signals from earthquakes are detectable over hundreds of kilometres. However, the different states were clearly dominated by the long-term trends of each component of the data, and each state was entered and exited only once (see e.g. Figure 5 in Granat (2006)); thus the method is not suitable for predictive purposes.

We therefore introduce a nonlinear filter for the GPS process to separate distinguishable signals from the bulk of the data. This filter calculates the range of the short-term deformation rates and thereby captures anomalies.
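The excerpt does not give the exact definition of the filter, so the following is only a minimal sketch of the idea under stated assumptions: the short-term deformation rate is taken as a finite difference over `lag` samples, and the filter output is the range (maximum minus minimum) of those rates within a sliding window. The function names and the `lag` and `window` parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def short_term_rates(positions: Sequence[float], lag: int = 1) -> List[float]:
    """Finite-difference deformation rate over `lag` samples (assumed definition)."""
    if lag < 1:
        raise ValueError("lag must be >= 1")
    return [(positions[i + lag] - positions[i]) / lag
            for i in range(len(positions) - lag)]


def rate_range_filter(positions: Sequence[float],
                      lag: int = 1,
                      window: int = 5) -> List[float]:
    """Range (max - min) of short-term rates in each sliding window.

    A steady linear trend gives a constant rate and hence a range of zero,
    while spikes, step jumps and trend changes give a large range.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    rates = short_term_rates(positions, lag)
    return [max(rates[i:i + window]) - min(rates[i:i + window])
            for i in range(len(rates) - window + 1)]
```

Under these assumptions the filter is insensitive to a constant long-term trend, which is the property the text asks for: a purely linear series maps to zeros, whereas a step jump produces a burst of non-zero values around the jump.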

Studies concerning earthquake genesis suggest an acceleration in small scale seismicity before a large event (Bowman et al., 1998, Jaumé and Sykes, 1999, Vere-Jones et al., 2001) which may have a cyclic nature (Jaumé and Bebbington, 2004, Bebbington et al., 2010). GPS measurements of deformation may therefore capture this acceleration, indirectly reflecting the underlying dynamics (the unobservable or hidden states) of the earthquake system (cf. Wang et al., in press). Under this assumption, our nonlinear filter and the underlying dynamics form an HMM framework. The HMM categorizes the data into different states, each state suggesting particular dynamics, one or more of which may have a precursory character. We thus use the Viterbi algorithm (Viterbi, 1967, Forney, 1973) to track the most probable sequence of states from the GPS data, and calculate the mutual information (MI) between each state from the most likely state sequence and the earthquake occurrences to examine if there is any association. We also discuss a possible way of producing earthquake forecasts based on the mutual information results. We illustrate the method on data from the central North Island, New Zealand and data from Southern California.
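To make the state-tracking step concrete, here is a minimal sketch of the Viterbi recursion in log space. It is not the authors' implementation: the Gaussian emission density, the helper `gaussian_logpdf`, and the argument names are assumptions made for illustration, and the model parameters are taken as already estimated.

```python
import math
from typing import List, Sequence


def gaussian_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of a normal distribution (assumed emission model)."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))


def viterbi(log_init: Sequence[float],
            log_trans: Sequence[Sequence[float]],
            log_emit: Sequence[Sequence[float]]) -> List[int]:
    """Most probable hidden state sequence.

    log_init[k]     : log P(first state = k)
    log_trans[i][j] : log P(next state = j | current state = i)
    log_emit[t][k]  : log density of observation t under state k
    """
    n_states = len(log_init)
    if len(log_emit) == 0:
        return []
    # delta[k]: best log probability of any path ending in state k so far.
    delta = [log_init[k] + log_emit[0][k] for k in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_delta, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j]
                             + log_emit[t][j])
        backpointers.append(pointers)
        delta = new_delta
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: delta[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Working in log space avoids numerical underflow on long daily series. The returned list of state indices is the kind of sequence that can then be compared with earthquake occurrences, one state at a time.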

Section snippets

Mutual information analysis

The association between signals (‘states’ in the HMM formulation) and the earthquake occurrences will be measured using mutual information, which quantifies the amount of information that one random variable contains about another (Cover and Thomas, 1991). It can provide evidence of any significant association between two series of events.

The mutual information of a bivariate random variable $(U,V)$ is defined as $I_{UV} = E\left\{\log_2\left(\frac{p_{UV}(u,v)}{p_U(u)\,p_V(v)}\right)\right\}$, where $p_{UV}(u,v)$ is the joint probability mass
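As an illustration of the definition above, the sketch below estimates the mutual information of two equally long discrete series by plugging empirical frequencies into the formula. The plug-in estimator and the function name are assumptions made here; the excerpt does not say how the authors estimate the probabilities. One natural reading, also an assumption, is that `u` marks whether the HMM is in a given state on each day and `v` marks whether an earthquake occurs.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(u: Sequence[Hashable], v: Sequence[Hashable]) -> float:
    """Plug-in estimate of I_UV in bits from paired discrete observations."""
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("u and v must be non-empty and of equal length")
    n = len(u)
    joint = Counter(zip(u, v))   # counts for the joint mass p_UV(u, v)
    count_u = Counter(u)         # counts for the marginal p_U(u)
    count_v = Counter(v)         # counts for the marginal p_V(v)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        p_a = count_u[a] / n
        p_b = count_v[b] / n
        # Pairs that never occur contribute nothing and are skipped implicitly.
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Two identical balanced binary series give 1 bit, and two empirically independent series give 0. With short series the plug-in estimate tends to be biased upwards, so a value near zero should not be read as evidence of association without some significance check.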

Case study—data from the central North Island, New Zealand

As a case study, we consider data from the central North Island, New Zealand, located near the boundary of the Australian tectonic plate. According to DeMets et al. (1994), the Pacific and Australian tectonic plates are converging obliquely at about 42 mm/yr, accommodated by subduction of the Pacific plate and deformation of the overlying Australian plate (see e.g. Figure 1 in Reyners et al. (2006)). This area also contains the Taupo Volcanic Zone (TVZ), an active continental rift in the

Case study—data from Southern California

We now consider another data set, with longer sequences of observations, from a different tectonic environment (strike-slip, rather than subduction-related rifting) in Southern California. The southern part of the San Andreas fault (as shown in Fig. 10), which forms the tectonic boundary between the Pacific Plate (on the west) and the North American Plate (on the east), runs through Southern California. The motion of the San Andreas fault is right-lateral strike-slip. The Pacific Plate moves

Discussion

First, we must emphasize that what we have presented here is an exploratory paper, developing preliminary techniques for analyzing potential precursory GPS signals. We have been greatly limited by the available data. There must be a lengthy GPS series, with constant variance, and moreover, a reasonable number of earthquakes need to have occurred in proximity to the GPS stations. Over the next decade or two, much more data will accumulate, in both time and space, allowing for further model

Acknowledgments

This work was supported by the Marsden Fund, administered by the Royal Society of New Zealand. We would like to thank John Beavan, Marco Brenna, the anonymous reviewers and the editor for providing helpful suggestions on an earlier draft which have improved the paper and the geophysical interpretation of our results.

References (45)

  • M.S. Bebbington et al. Repeated intermittent earthquake cycles in the San Francisco bay region. Pure and Applied Geophysics (2010)
  • D.D. Bowman et al. An observational test of the critical earthquake concept. Journal of Geophysical Research (1998)
  • D.R. Brillinger. Some data analyses using mutual information. Brazilian Journal of Probability and Statistics (2004)
  • J. Bulla et al. Computational issues in parameter estimation for stationary hidden Markov models. Computational Statistics & Data Analysis (2008)
  • G. Celeux et al. Selecting hidden Markov model state number with cross-validated likelihood. Computational Statistics & Data Analysis (2008)
  • R. Christensen. Log-Linear Models for Logistic Regression (1997)
  • D. Collett. Modelling Binary Data (1991)
  • T.M. Cover et al. Elements of Information Theory (1991)
  • C. DeMets et al. Effect of recent revisions to the geomagnetic reversal time scale on estimates of current plate motions. Geophysical Research Letters (1994)
  • C. DeMets et al. A revised estimate of Pacific–North America motion and implications for western North America plate boundary zone tectonics. Geophysical Research Letters (1987)
  • G. Di Grazia et al. A multiparameter approach to volcano monitoring based on 4D analyses of seismo-volcanic and acoustic signals: the 2008 Mt. Etna eruption. Geophysical Research Letters (2009)
  • G.D. Forney. The Viterbi algorithm. Proceedings of the IEEE (1973)