Elsevier

Signal Processing

Volume 83, Issue 1, January 2003, Pages 79-90

Nonparametric log spectrum estimation using disconnected regression splines and genetic algorithms

https://doi.org/10.1016/S0165-1684(02)00379-1

Abstract

This article proposes a new nonparametric procedure for estimating log spectra. The procedure consists of three major components: (1) a novel statistical model for the unknown target log spectrum, (2) an AIC-based model selection criterion for choosing a ‘best’ fitting model, and (3) a genetic algorithm for effectively searching for the ‘best’ fitting model. Numerical experiments are conducted to evaluate the practical performance of the proposed procedure and to compare it with other common log spectral estimation procedures in the literature, namely wavelet techniques, kernel smoothing and regression spline fitting. Empirical results suggest that the proposed procedure compares favourably with all of these procedures, especially when the unknown log spectrum contains inhomogeneous structures.

Introduction

This article studies the problem of nonparametric log spectral density estimation. Various log spectrum estimation procedures that adopt the idea of smoothing the log periodogram have been proposed. These include kernel smoothing (e.g., [6], [14], [17], [20]), smoothing spline methods (e.g., [19], [23]) and wavelet techniques (e.g., [9], [18], [24]). The new procedure that this article proposes uses a different statistical model for the target log spectrum: the target log spectrum is modelled by a series of disconnected regression splines that partition the domain of the spectrum (briefly, regression splines are a special kind of piecewise polynomial function; a brief introduction is given in Section 2.2). As will be demonstrated below, such a model is particularly suitable for spectra with inhomogeneous structures.

It will be shown below that the problem of estimating a log spectrum using this disconnected regression spline approach can be posed as a statistical model selection problem, in which different candidate models may have different dimensions. To tackle this model selection problem, we employ a modified form of Akaike's information criterion (AIC) [1] to construct an objective function whose optimizer is defined to be the best fitting model for our problem. However, optimizing this objective function involves solving a hard, large-scale optimization problem. In this work we propose using genetic algorithms for solving such problems (e.g., see [5], [7] and references given therein). Simulation results suggest that the use of genetic algorithms is very effective.

The rest of this article is organized as follows. Background material on the properties of log periodograms and regression splines is given in Section 2. In Section 3 we present our new disconnected regression spline model for log spectra. Section 4 describes the above-mentioned AIC model selection method, while Section 5 proposes a genetic algorithm for solving the related optimization problem. Section 6 reports simulation results. Finally, conclusions are given in Section 7.

Section snippets

Log periodogram

Suppose that $\{x_t\}$ is a real-valued, zero-mean strictly stationary process with unknown spectral density $S$, and that a finite-sized realization $x_0,\ldots,x_{2n-1}$ of $\{x_t\}$ is observed. The periodogram is defined as

$$I(\omega)=\frac{1}{2\pi\times 2n}\left|\sum_{t=0}^{2n-1}x_t\exp(-i\omega t)\right|^2,\qquad \omega\in[0,2\pi).$$

To simplify notation, write $\omega_l=2\pi l/(2n)$. Since the spectral density $S$ is symmetric about $\omega=\pi$, we shall focus our discussion on $S(\omega_l)$ for $l=0,\ldots,n-1$.
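The definition above can be evaluated at the frequencies $\omega_l$ with a fast Fourier transform. The following is a minimal sketch, assuming NumPy; the function names `periodogram` and `log_periodogram` are illustrative and do not come from the article. The additive constant in `log_periodogram` relies on the standard large-sample approximation that $I(\omega_l)/S(\omega_l)$ is roughly a unit exponential variable for $0<\omega_l<\pi$, whose logarithm has mean $-\gamma$ (Euler's constant); whether the article applies exactly this correction is an assumption here.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def periodogram(x):
    """Return (omega_l, I(omega_l)) for l = 0, ..., n-1, given 2n samples."""
    x = np.asarray(x, dtype=float)
    two_n = x.size
    if two_n % 2 != 0:
        raise ValueError("expected an even number of samples (2n)")
    n = two_n // 2
    # np.fft.fft returns sum_t x_t * exp(-i * omega_l * t) with omega_l = 2*pi*l/(2n)
    dft = np.fft.fft(x)[:n]
    omega = 2.0 * np.pi * np.arange(n) / two_n
    intensity = np.abs(dft) ** 2 / (2.0 * np.pi * two_n)
    return omega, intensity


def log_periodogram(x):
    """Log periodogram shifted by Euler's constant (assumed bias correction)."""
    omega, intensity = periodogram(x)
    return omega, np.log(intensity) + EULER_GAMMA
```

For a pure cosine at the Fourier frequency $\omega_3$ with $2n=16$ samples, `periodogram` puts essentially all of its mass at index 3, which is a quick sanity check on the normalization.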

Let $\gamma_r=E(x_{t+r}x_t)$, $r=0,1,\ldots$, be the autocovariance function. If all moments of $x_t$ exist, the sum of all |γr

Log spectrum model: disconnected regression splines

This section presents our model for the log spectrum f. One characteristic of our model is that it is capable of handling f with inhomogeneous structures. The idea is to approximate f by a series of disconnected quadratic regression splines. In this way boundary points between any two adjacent quadratic regression splines can serve as locations of sudden changes in f; see Fig. 2 for an illustration. In the sequel we shall call these boundary points break points. Despite that regression splines
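To make the idea of disconnected quadratic regression splines concrete, here is a minimal sketch, assuming NumPy and a truncated power basis ($1, \omega, \omega^2, (\omega-k)_+^2$) within each piece. The article's exact parametrization is not reproduced in this excerpt, so the basis, the least-squares fit, and the names `quad_spline_basis` and `fit_disconnected_splines` are illustrative assumptions rather than the authors' formulation.

```python
import numpy as np


def quad_spline_basis(w, knots):
    """Truncated power basis of a quadratic regression spline (assumed form)."""
    cols = [np.ones_like(w), w, w ** 2]
    cols += [np.clip(w - k, 0.0, None) ** 2 for k in knots]
    return np.column_stack(cols)


def fit_disconnected_splines(w, y, breaks, knots):
    """Fit one quadratic spline per piece; pieces are separated by `breaks`.

    `knots` holds one (possibly empty) list of knot locations per piece.
    No continuity is imposed across a break point, so the fitted curve
    is free to jump there.
    """
    w = np.asarray(w, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = [-np.inf, *breaks, np.inf]
    fitted = np.empty_like(y)
    for j in range(len(edges) - 1):
        mask = (w >= edges[j]) & (w < edges[j + 1])
        design = quad_spline_basis(w[mask], knots[j])
        # Ordinary least squares within the piece
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        fitted[mask] = design @ coef
    return fitted
```

Because each piece is fitted separately, a sudden change in the target curve at a break point is reproduced without the smoothing across the jump that a single connected spline would introduce.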

Model selection and parameter estimation

If $f$ is modelled by the above disconnected regression splines, then an estimate $\hat f$ of $f$ can be obtained by first estimating $\theta=\{B,b,m,\{k_j,\alpha_j,\beta_j\}_{j=1}^{B}\}$ and then plugging the resulting estimate $\hat\theta=\{\hat B,\hat b,\hat m,\{\hat k_j,\hat\alpha_j,\hat\beta_j\}_{j=1}^{\hat B}\}$ into (3) and (4). Hence, using the disconnected regression spline approach, our original log spectrum estimation problem can be posed as a model selection problem, with each candidate model specified by a $\hat\theta$. The goal, then, is to select a “best” $\hat\theta$. Notice that different $\hat\theta$

Optimization by genetic algorithms

When the number of data points is large, finding the best estimate defined by the above AIC/BIC criterion would involve solving a hard, large-scale minimization problem. Common techniques for dealing with these types of problems include knot addition, knot deletion, knot movement or combinations of them; e.g., [8], [13], [16]. However, these techniques do not provide for the inclusion of break points in our model. In this article we suggest using genetic algorithms, which are also known as
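The general shape of a genetic algorithm search over break points can be sketched as follows. This is a simplified stand-in, not the article's algorithm: it assumes a binary chromosome marking break locations, a piecewise-constant fit instead of quadratic splines, and a generic known-variance AIC-type score (using the log-periodogram noise variance $\pi^2/6$). The names `criterion` and `genetic_search` and all tuning parameters are hypothetical.

```python
import numpy as np

NOISE_VAR = np.pi ** 2 / 6  # variance of the log of a unit exponential variable


def criterion(chrom, y):
    """AIC-type score of a piecewise-constant fit (assumed, simplified form).

    chrom[i] == 1 marks a break between y[i] and y[i + 1].
    """
    segments = np.split(y, np.flatnonzero(chrom) + 1)
    rss = sum(((s - s.mean()) ** 2).sum() for s in segments)
    n_params = 2 * len(segments) - 1  # segment means plus break locations
    return rss / NOISE_VAR + 2 * n_params


def genetic_search(y, pop_size=40, generations=60, p_mut=0.02, seed=0):
    """Minimize `criterion` with elitism, tournaments, crossover and mutation."""
    rng = np.random.default_rng(seed)
    n_genes = y.size - 1
    pop = (rng.random((pop_size, n_genes)) < 0.05).astype(int)
    pop[0] = 0  # always include the model without break points
    for _ in range(generations):
        scores = np.array([criterion(c, y) for c in pop])
        children = [pop[np.argmin(scores)].copy()]  # elitism: keep the best
        while len(children) < pop_size:
            parents = []
            for _ in range(2):  # binary tournament selection
                a, b = rng.integers(pop_size, size=2)
                parents.append(pop[a] if scores[a] < scores[b] else pop[b])
            take_first = rng.random(n_genes) < 0.5  # uniform crossover
            child = np.where(take_first, parents[0], parents[1])
            flip = rng.random(n_genes) < p_mut  # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        pop = np.array(children)
    scores = np.array([criterion(c, y) for c in pop])
    return pop[np.argmin(scores)], scores.min()
```

Because the best chromosome is copied unchanged into every new generation, the best score never gets worse as the search proceeds; the random operators only add candidates around it.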

Numerical experiments

This section reports the results of numerical experiments conducted to evaluate the practical performance of the proposed log spectrum estimation procedure and to compare it with that of some other procedures appearing in the literature.

Conclusion

In this article an automatic log spectrum estimation procedure based on the disconnected regression spline model is proposed. Numerical experiments have been conducted to compare the practical performance of the proposed procedure with that of some other log spectral density estimation procedures commonly found in the literature. Empirical results from the experiments suggest that the proposed procedure, despite its high complexity, is a very promising and reliable procedure, especially when the

References (25)

  • T.C.M. Lee, Regression spline smoothing using the minimum description length principle, Statist. Probab. Lett. (2000)
  • P. Stoica et al., Optimally smoothed periodogram, Signal Process. (1999)
  • H. Akaike, A new look at the statistical model identification, IEEE Trans. Autom. Control (1974)
  • D.R. Brillinger, Time Series: Data Analysis and Theory (1981)
  • K.P. Burnham et al., Model Selection and Inference: A Practical Information-Theoretic Approach (1998)
  • G.W. Cobb, Introduction to Design and Analysis of Experiments (1997)
  • L. Davis, Handbook of Genetic Algorithms (1991)
  • J. Fan et al., Automatic local smoothing for spectral density estimation, Scand. J. Statist. (1998)
  • D.B. Fogel, Evolutionary computing, IEEE Spectrum (February 2000)...
  • J.H. Friedman et al., Flexible parsimonious smoothing and additive modeling, Technometrics (1989)
  • H.-Y. Gao, Choice of thresholds for wavelet shrinkage estimate of the spectrum, J. Time Ser. Anal. (1997)
  • T.J. Hastie et al., Generalized Additive Models (1990)