Elsevier

Signal Processing

Volume 85, Issue 3, March 2005, Pages 449-456

Cumulant-based order selection of non-Gaussian autoregressive moving average models: the corner method

https://doi.org/10.1016/j.sigpro.2004.10.011

Abstract

This paper presents a new corner location method for model order selection of an autoregressive moving average (ARMA) model. The criterion is determined in terms of the minimum eigenvalue of the third-order cumulant matrix derived from the observed data sequence. The observed sequence is modeled as the output of an ARMA system that is excited by an unobservable input, and is corrupted by zero-mean Gaussian additive noise. The system is driven by a zero-mean independent and identically distributed (i.i.d.) non-Gaussian sequence. The method is an extension of recent results based on the third-order cumulant (TOC) by Al-Smadi and Wilkes. Simulations verify the performance of the proposed method even when the observed signal is heavily corrupted by additive noise. In computer simulations, the proposed estimator is found to outperform the TOC estimator of Al-Smadi and Wilkes.

Introduction

Autoregressive moving average (ARMA) models are mathematical models of persistence in time series analysis. A time series is a sequence of observations that are ordered in time or space. If observations are made on some phenomenon throughout time, it is most sensible to display the data in the order in which they arose, since successive observations will probably be dependent. Time series models, or ARMA models, are excellent for random data if the model type and the model order are known [8]. ARMA models find applications in many diverse fields such as signal modeling, spectrum estimation, communications, biomedical signal processing, speech signal processing, system identification, and adaptive control. ARMA models describe a stationary stochastic process very accurately if the right number of parameters is used [5]. For the estimation of an ARMA model of a stationary stochastic process, three basic problems can be distinguished: estimation of the model order, estimation of the model parameters, and estimation of the expected fit of a selected model to future data [7]. Successful identification of ARMA parameters depends on correct model order selection. Selecting the order of an ARMA model is one of the most difficult problems in developing a linear model for data [13]. The determination of the number of parameters in a model used to fit a data set is a well-known and well-researched problem.

Model order determination of ARMA models is an area of research to which much effort has been devoted in the past. This problem has been of considerable interest for some time and has a long and continuous history. The reason for this interest is twofold: the relevance of the issue in many practical applications and the dissatisfaction of users with the existing methods. This dissatisfaction comes from the fact that order determination is an ill-posed problem; that is, the desirable features of an algorithm can hardly be written in mathematical form. In most practical cases, the model order is not known. Yet in many of the commonly employed ARMA modeling algorithms this crucial step is ignored, the order is chosen rather arbitrarily, or the order is assumed to be available. For example, in spectrum analysis and modeling, the problem of model order selection is of utmost importance [15], because the accuracy of the frequency estimates depends on the estimated order of the prediction filter [14].

Pioneering work on order selection has been done by Akaike [1], Rissanen [11], Schwarz [12], and Parzen [10]. Much attention has been given in the literature to the determination of the autoregressive (AR) order. However, the problem of ARMA model order determination is much more difficult [15]. One method by Liang et al. [9] is shown to yield a level of performance for general ARMA model order estimation never before achieved. This method is derived from the minimum description length (MDL) principle [11], [12]. It is based on the minimum eigenvalue (MEV) of a family of covariance matrices computed from the observed data. Liang et al. showed that the MDL does not work well at low signal-to-noise ratio (SNR) and is computationally expensive. This is because the prediction error used in computing the MDL is directly affected by the accuracy of the parameter estimates.

An extensive review of the literature on order determination methods was done by Al-Smadi and Wilkes [4]. In their paper, they also formulated four practical criteria for a successful model order selection technique. Among the desirable features is robustness to additive noise. This is the main point of their paper, and it is dealt with by taking advantage of the insensitivity of higher order statistics to Gaussian noise. The paper is strongly based on the work by Liang [9]; more specifically, the properties that are developed there for the signal samples are extended in [4] to the third-order cumulant (TOC) sequence. Another important contribution of [4] is its ability to drop the white noise assumption of [9]. That is, they extended the original results of Liang to the case of colored Gaussian noise. Although the problem of model order determination is an “old” problem and widely considered to be solved, recent results by Liang [9] and Al-Smadi and Wilkes [4] indicate that this is far from the case. In fact, the much higher accuracy of these new algorithms calls for re-examination of this important problem.
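
The insensitivity of third-order statistics to Gaussian noise can be illustrated numerically. The following is a minimal sketch, not the estimator used in [4]: the function name `third_order_cumulant`, the biased 1/N normalization, the sample size, and the exponential test input are all assumptions made for illustration. For a zero-mean stationary sequence, the third-order cumulant at lags (tau1, tau2) is E[x(t) x(t+tau1) x(t+tau2)], which vanishes for a Gaussian process but not, in general, for a skewed non-Gaussian one.

```python
import random


def third_order_cumulant(x, tau1, tau2):
    """Biased sample estimate of E[x(t) x(t+tau1) x(t+tau2)], nonnegative lags.

    The sample mean is removed first, so the input need not be zero-mean.
    """
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    span = n - max(tau1, tau2)  # number of products that stay inside the record
    total = sum(xc[t] * xc[t + tau1] * xc[t + tau2] for t in range(span))
    return total / n


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000               # illustrative sample size (assumption)

# Zero-mean Gaussian sequence: its third-order cumulants are theoretically zero.
gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Centered unit-rate exponential sequence: zero-mean, i.i.d., and skewed.
expo = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

c3_gauss = third_order_cumulant(gauss, 0, 0)
c3_expo = third_order_cumulant(expo, 0, 0)
print("Gaussian    c3(0,0):", c3_gauss)
print("Exponential c3(0,0):", c3_expo)
```

At lag (0, 0) the Gaussian estimate should be close to zero, while the estimate for the centered unit-rate exponential sequence should be close to its third central moment, which is 2.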

In this paper, we present a new, theoretically motivated approach to the problem of ARMA model order estimation. The proposed algorithm is based on the minimum eigenvalue of a third-order cumulant matrix derived from the observed data sequence. The observed sequence is modeled as the output of an ARMA system that is excited by an unobservable input, and is corrupted by zero-mean Gaussian additive noise. The proposed method is compared with the TOC method [4] for different SNRs on the output signal. The TOC method is briefly reviewed in Section 2. Section 3 describes the proposed method. Section 4 contains examples of the proposed algorithm. Section 5 is devoted to concluding remarks.

Section snippets

Problem formulation

Let $x(t)$ denote a real-valued stationary ARMA($p$,$q$) signal given by $\sum_{i=0}^{p} a_i x(t-i) = \sum_{i=0}^{q} b_i w(t-i)$, where $w(t)$ is the excitation sequence and $x(t)$ is the noiseless output signal. The excitation signal $w(t)$ is assumed to be a zero-mean, non-Gaussian, independent and identically distributed (i.i.d.) process. The parameters $a_0,\ldots,a_p$ are the AR parameters; the number of AR parameters is the order $p$. The parameters $b_0,\ldots,b_q$ are the MA parameters; $q$ is the MA order. We model the noisy output as $x_0(t) = x(t) + v(t)$
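
The signal model above can be sketched in a few lines. This is only an illustration under stated assumptions, not the simulation setup of the paper: the helper names `simulate_arma` and `add_gaussian_noise`, the ARMA(1,1) coefficients, the exponential driving sequence, the 10 dB SNR, and the zero initial conditions are all hypothetical choices. Only the record length of 1500 points is taken from the simulation section.

```python
import math
import random


def simulate_arma(a, b, w):
    """Solve sum_i a[i] x(t-i) = sum_i b[i] w(t-i) for x(t), zero initial conditions."""
    x = []
    for t in range(len(w)):
        # Moving-average part: weighted sum of current and past inputs.
        ma = sum(b[i] * w[t - i] for i in range(len(b)) if t - i >= 0)
        # Autoregressive part: weighted sum of past outputs (i >= 1).
        ar = sum(a[i] * x[t - i] for i in range(1, len(a)) if t - i >= 0)
        x.append((ma - ar) / a[0])
    return x


def add_gaussian_noise(x, snr_db, rng):
    """Return x(t) + v(t), with v(t) white zero-mean Gaussian at the requested SNR."""
    signal_power = sum(v * v for v in x) / len(x)
    sigma = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    return [v + rng.gauss(0.0, sigma) for v in x]


rng = random.Random(1)
N = 1500  # record length quoted in the simulation section

# Hypothetical stable ARMA(1,1): x(t) - 0.5 x(t-1) = w(t) + 0.4 w(t-1).
a = [1.0, -0.5]
b = [1.0, 0.4]

# Zero-mean, i.i.d., non-Gaussian excitation (centered exponential, an assumption).
w = [rng.expovariate(1.0) - 1.0 for _ in range(N)]

x = simulate_arma(a, b, w)                 # noiseless output
x_noisy = add_gaussian_noise(x, 10.0, rng)  # noisy observation at 10 dB SNR
print(len(x_noisy))
```

Only `x_noisy` would be available to an order selection method; the driving sequence `w` and the clean output `x` are kept here purely to generate the data.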

Proposed algorithm

Now, the JMEV and JTOC criteria are theoretically sound. However, the row/column ratio table method was presented in [9] as a method that works, without any kind of mathematical proof. Even though this method provides good estimates of the true model order [9], [4], there is no justification for why it works. We will now investigate a new method to locate the corner that works and can be justified. The method is theoretically motivated and is derived from the cost function, JTOC, in Eq.

Simulation examples

In this section, we present simulation results concerning the proposed approach to model order selection from only the observed noisy output data. To study the robustness of the algorithm, a number of experiments were performed. In these experiments, the proposed method has been compared with the TOC method. The computations were performed in MATLAB. A finite length of N=1500 points was considered in each experiment. The driving input sequence is not observed. However, it is needed for

Conclusion

In this paper, the problem of estimating the model order of a general ARMA process has been investigated. The method presented is an extension of the results by Al-Smadi and Wilkes. As in the TOC method, we look for a corner in the tabulation of the cost function JTOC. The corner is detected by transforming the JTOC matrix into a row vector and a column vector. Each vector defines one side of the corner; i.e., the AR and MA model orders. The proposed method demonstrated superior performance over

Acknowledgments

The author would like to thank the anonymous referees whose comments and suggestions contributed to the improvement of the paper.

References (15)

  • A. Al-Smadi et al., Fitting ARMA models to linear non-Gaussian processes using higher order statistics, Signal Process. Internat. J. (November 2002)
  • J. Rissanen, Modeling by shortest data description, Automatica (1978)
  • H. Akaike, A new look at statistical model identification, IEEE Trans. Automat. Control (1974)
  • A. Al-Smadi, D.M. Wilkes, On estimating ARMA model orders, IEEE International Symposium on Circuits and Systems, May...
  • A. Al-Smadi et al., Robust and accurate ARX and ARMA model order estimation of non-Gaussian processes, IEEE Trans. Signal Process. (March 2002)
  • N. Beamish et al., A study of autoregressive and window spectral estimation, Appl. Stat. (1981)
  • P. Broersen, The quality of models for ARMA processes, IEEE Trans. Signal Process. (June 1998)
There are more references available in the full text version of this article.

Cited by (10)

  • A new approach for geological pattern recognition using high-order spatial cumulants

    2010, Computers and Geosciences
    Citation Excerpt :

    High-order cumulants are combinations of moment statistical parameters that allow the characterization of non-Gaussian random variables (Billinger and Rosenblatt, 1966), and may be seen as an extension of the well-known covariance function. They are critical contributors to non-Gaussian and non-linear modelling, where related developments include cumulants for signal filtering and deconvolution (Al-Smadi, 2004; Nikias and Petropulu, 1993; Sadler et al., 1995; Delopoulos and Giannakis, 1996; Dembélé and Favier, 1998; Zhang, 2005), or for estimating the gravitational evolution of the cosmic distribution function (Gaztanaga et al., 2000) and conditional cumulants and high-order statistics in the so-called high-precision astronomy (Bernardeau et al., 2002). A key justification for the use of cumulants is the wealth of information they contain compared to second-order statistical measures (Pan and Szapudi, 2005).

  • Parameter estimation of DSSS signals in non-cooperative communication system

    2007, Journal of Systems Engineering and Electronics
  • A new technique for arma-system identification based on qr-decomposition of third order cumulants matrix

    2021, International Journal of Circuits, Systems and Signal Processing
  • Seismic wavelet extraction based on auto-regressive and moving average model and particle swarm optimization

    2011, Zhongguo Shiyou Daxue Xuebao (Ziran Kexue Ban)/Journal of China University of Petroleum (Edition of Natural Science)