The fan-chirp transform for non-stationary harmonic signals

doi:10.1016/j.sigpro.2007.01.006

Signal Processing

Volume 87, Issue 6, June 2007, Pages 1504-1522

https://doi.org/10.1016/j.sigpro.2007.01.006

Abstract

This paper presents a novel transform related to the framework of warping operators when the continuous time warping mapping is a second-order polynomial. This case is proven in the paper to be the only one from the aforementioned group that marginalizes the Wigner distribution along line paths, in particular, with a fan geometry. The properties and attributes of the fan-chirp transform (FChT), along with the analytical characterization of harmonically related Gaussian chirplets, are of special relevance in the paper. This analysis shows that for chirp-periodic signals the FChT can reach the limit of the time–frequency (TF) uncertainty principle while keeping the cross-terms at a minimum level. The formulation of the fast digital computation of the FChT is also provided. Two practical scenarios, the analysis of speech with natural intonation and of bat ultrasound, validate the theoretical developments and clearly show the competitive performance of the new transform.

Introduction

Identifying frequency-modulated (FM) sinusoids or chirps in a signal is known to be a tough challenge for classical linear analysis. Most of the alternative solutions to this problem have come from the field of time–frequency (TF) analysis, mainly in the form of Cohen's class bilinear time–frequency distributions (TFD) [1]. However, the long debate on the relevance and meaning of the cross-terms [2] and the dilemma on the need for positive TFDs [3], [4] have shifted attention towards different approaches, such as matching pursuits [5] over redundant chirplet dictionaries [6], [7], [8], or chirp-based transforms that marginalize the Wigner–Ville distribution according to certain geometries [10], [11], [12], [13], [14]. Although the redundant dictionary remains the most popular method for chirplet decomposition so far, chirp-based transforms are especially interesting because they can provide a broader picture of the TF content of the signal.

Several signal processing transforms are related to the term “chirp”: the Chirp-Z [9] and the “Chirplet” transforms [10] explicitly contain the term, whilst the fractional Fourier transform [11], [12] and the warped-time operators [13], [14] are conceptually related to it. The Chirp-Z transform [9] is an efficient algorithm for calculating discrete Fourier transforms (DFTs) at frequencies not related to a power-of-two fraction of the sampling bandwidth. Although a discrete-time chirp signal is used in the mechanism, this algorithm is not actually related to the context of this paper.

The first relevant chirp-based transform, the chirplet transform (CT) [10], is described by the inner product between the signal and a chirplet as $X_{β} (f) = \int_{- \infty}^{\infty} x (t) g_{ϱ, τ, f, β}^{*} (t) d t,$ where t is time, $x (t)$ is the analysis signal, $^{*}$ denotes complex conjugate, and $g_{ϱ, τ, f, β} (t)$ is a Gaussian chirplet of unit energy $g_{ϱ, τ, f, β} (t) = \frac{e^{- (1 / 2) ((t - τ) / ϱ)^{2}}}{\sqrt[4]{π ϱ^{2}}} e^{j 2 π (f (t - τ) + (1 / 2) β (t - τ)^{2})} .$ Here $f$ is the instantaneous frequency (IF) at $t = τ$, $β$ the frequency variation rate and $ϱ$ the time spread. By disregarding the Gaussian term and setting $τ = 0$ for simplicity, it is easy to deduce that the squared magnitude of the Chirp(let) transform is $| X_{β} (f) |^{2} = \int_{- \infty}^{\infty} {WD}_{x} (t, f + β t) d t,$ where ${WD}_{x} (t, f)$ is the Wigner–Ville distribution [2] of $x (t)$ (henceforth Wigner distribution, WD). Thus, the CT yields the “slanted” marginal of the WD.
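The chirplet definition and the inner product can be checked numerically. The following is a minimal sketch, not code from the paper: the function names, the sampling grid and the test parameters (a 50 Hz chirplet with $β$ = 100 Hz/s and $ϱ$ = 0.1 s) are illustrative assumptions, and the integral is approximated by a plain Riemann sum.

```python
import cmath
import math


def gaussian_chirplet(t, rho, tau, f, beta):
    """Unit-energy Gaussian chirplet g_{rho,tau,f,beta}(t), following the definition in the text."""
    envelope = math.exp(-0.5 * ((t - tau) / rho) ** 2) / (math.pi * rho ** 2) ** 0.25
    phase = 2.0 * math.pi * (f * (t - tau) + 0.5 * beta * (t - tau) ** 2)
    return envelope * cmath.exp(1j * phase)


def chirplet_transform(x, ts, rho, tau, f, beta):
    """Riemann-sum approximation of the inner product of x with the chirplet."""
    dt = ts[1] - ts[0]
    acc = 0j
    for xi, t in zip(x, ts):
        acc += xi * gaussian_chirplet(t, rho, tau, f, beta).conjugate()
    return acc * dt


# Hypothetical test signal: the chirplet itself, sampled on [-1 s, 1 s] at 1 kHz.
dt = 1e-3
ts = [-1.0 + k * dt for k in range(2001)]
x = [gaussian_chirplet(t, 0.1, 0.0, 50.0, 100.0) for t in ts]

energy = sum(abs(v) ** 2 for v in x) * dt                          # close to 1 (unit energy)
matched = abs(chirplet_transform(x, ts, 0.1, 0.0, 50.0, 100.0))    # analysis beta equals signal beta
mismatched = abs(chirplet_transform(x, ts, 0.1, 0.0, 50.0, 0.0))   # beta = 0: plain Gabor atom
print(energy, matched, mismatched)
```

With these numbers the mismatched response follows the closed form $(1 + π^{2} β^{2} ϱ^{4})^{- 1 / 4}$, so ignoring the frequency variation rate costs a little under half of the peak magnitude in this toy case.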

Another well-established chirp-based transform is the fractional Fourier transform (FrFT) [11] $X_{θ} (u) = \int_{- \infty}^{\infty} x (v) K_{θ} (v, u) d v,$ where $K_{θ} (v, u)$ is the transformation kernel [12]. The FrFT involves products with linear-FM chirps in such a way that it yields the marginalization of the WD along the angular direction $θ$ , i.e., $| X_{θ} (u) |^{2} = \int_{- \infty}^{\infty} {WD}_{x} (c u - s v, s u + c v) d v,$ where $c = \cos θ$ and $s = \sin θ$ .

The last chirp-based transform considered here is the warping operator [13], [14], [15], defined as $X_{ψ (\cdot)} (f) = \int_{- \infty}^{\infty} x (ψ (t)) \sqrt{| ψ^{'} (t) |} e^{- j 2 π ft} d t,$ where $ψ (t)$ is a continuously differentiable time mapping and $ψ^{'} (t)$ its derivative. An equivalent formulation to this warped-time Fourier transform is $X (f; φ (\cdot)) = \int_{- \infty}^{\infty} x (t) \sqrt{| φ^{'} (t) |} e^{- j 2 π f φ (t)} d t,$ where $φ (t) = ψ^{- 1} (t)$. Eq. (7), the equivalent formulation just given, represents the inner product of signal $x (t)$ with a non-linear chirp, which is the basic mechanism in TF redundant dictionaries for chirp-based signal decomposition [8]. The warped-time framework has also given rise to new TF distributions, such as the generalized warped Cohen's class (GWCC) [16], which introduces new ways of interpreting the TF content.
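The equivalence of the two formulations follows from a change of variable. The short derivation below is a sketch under the added assumption that $ψ$ is a strictly increasing bijection of the real line, so that $φ = ψ^{- 1}$ exists and $φ^{'} > 0$; the integration variable $u$ is introduced only for this step.

```latex
\begin{aligned}
X_{ψ(\cdot)}(f)
  &= \int_{-\infty}^{\infty} x(ψ(t)) \sqrt{|ψ'(t)|}\, e^{-j 2π f t}\, dt
  && \text{substitute } u = ψ(t),\; t = φ(u),\; dt = φ'(u)\, du \\
  &= \int_{-\infty}^{\infty} x(u)\, \frac{1}{\sqrt{|φ'(u)|}}\, e^{-j 2π f φ(u)}\, φ'(u)\, du
  && \text{because } ψ'(t) = 1 / φ'(u) \\
  &= \int_{-\infty}^{\infty} x(u) \sqrt{|φ'(u)|}\, e^{-j 2π f φ(u)}\, du
  = X(f; φ(\cdot)).
\end{aligned}
```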

This paper tackles the warping operator (7) constrained to the mapping $φ (t)$ being a second-order polynomial. This case is proven here to marginalize the WD along straight line paths. Thus, it can be directly compared against other linear chirp-based transforms, such as the CT and FrFT (see Fig. 1 for an introductory illustration), and studied in more detail apart from the general framework. The paper is organized as follows: Section 2 introduces the analysis and synthesis equations of the proposed fan-chirp transform (FChT); in Section 3 its main basic attributes, the marginalization geometry and representation of chirp-periodic signals are studied; Section 4 contains a discussion on previous works related to the FChT; Section 5 addresses the estimation of the only user-defined parameter of the FChT, the chirp rate, in order to better match the TF geometry of the analysis signal; Section 6 elaborates on the practical aspects of the digital implementation; Section 7 presents the performance evaluation of the FChT on synthetic and real scenarios, namely the analysis of natural speech and of mammal sounds; the conclusions close the paper.

Section snippets

The FChT

The analysis formula of the FChT of signal $x (t)$ is defined as $X (f, α) ≜ \int_{- \infty}^{\infty} x (t) \sqrt{| φ_{α}^{'} (t) |} e^{- j 2 π f φ_{α} (t)} d t,$ where t is time, f is frequency, and $φ_{α} (t)$ is the second-order polynomial controlled by the so-called chirp rate $α$, $φ_{α} (t) ≜ (1 + \frac{1}{2} α t) t .$ The FChT involves the inner product between $x (t)$ and the complex signals $ξ (t, f, α) = \sqrt{| 1 + α t |} e^{j 2 π f (1 + (1 / 2) α t) t},$ which are chirps whose IF, defined as the time derivative of the exponent, varies linearly over time $ν (t) =$
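The analysis formula can be tried on a toy signal. The sketch below is an illustration under stated assumptions rather than the paper's implementation: the integral is replaced by a Riemann sum over a 0.2 s window, and the test signal (three partials with phase $2 π k f_{0} φ_{α} (t)$, $f_{0}$ = 100 Hz, $α$ = 2 s⁻¹) and all names are hypothetical.

```python
import cmath
import math


def phi(t, alpha):
    """Second-order warping polynomial phi_alpha(t) = (1 + alpha*t/2) * t."""
    return (1.0 + 0.5 * alpha * t) * t


def fcht(x, ts, f, alpha):
    """Riemann-sum approximation of X(f, alpha) over the sampled support of x."""
    dt = ts[1] - ts[0]
    acc = 0j
    for xi, t in zip(x, ts):
        weight = math.sqrt(abs(1.0 + alpha * t))  # sqrt(|phi'_alpha(t)|)
        acc += xi * weight * cmath.exp(-2j * math.pi * f * phi(t, alpha))
    return acc * dt


# Hypothetical chirp-periodic test signal: three partials with phase 2*pi*k*f0*phi(t).
fs, n_samples = 4000.0, 800          # 0.2 s window centred on t = 0
alpha_true, f0 = 2.0, 100.0          # chirp rate in 1/s, fundamental at t = 0 in Hz
ts = [(n - n_samples // 2) / fs for n in range(n_samples)]
x = [sum(math.cos(2.0 * math.pi * k * f0 * phi(t, alpha_true)) for k in (1, 2, 3))
     for t in ts]

for f in (100.0, 150.0, 200.0, 250.0, 300.0):
    matched = abs(fcht(x, ts, f, alpha_true))   # analysis chirp rate matches the signal
    fourier = abs(fcht(x, ts, f, 0.0))          # alpha = 0 reduces to the Fourier transform
    print(f, round(matched, 4), round(fourier, 4))
```

With the matched chirp rate the energy gathers at multiples of $f_{0}$, while the $α = 0$ column (an ordinary windowed Fourier transform) smears the higher partials over the band they sweep.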

Properties

In this section we derive the most relevant property of the FChT regarding the marginalization of the WD; Parseval's theorem along with other basic properties are also derived; the TF resolution over harmonically related chirplets covers the final part of the section.

Prior related contributions

The FChT compares directly against the Chirplet [10], the fractional Fourier [12], and the Fourier transforms, all yielding the marginalization of the TF plane along different straight line geometries (see Fig. 1 again for an illustrative comparison). Additionally, it is important to address here previous works [19], [20], [21], [22] related to the FChT to a greater or lesser extent.

The work [19] proposes the so-called Harmonic fractional Fourier transform (HFT) $HFT (ω) ≜ \int_{- \infty}^{\infty} x (t) e^{- j ω (1 + A t) t} d t$ as

Chirp rate estimation

The most important aspect of the FChT in practical scenarios is how well the law $φ_{α} (t)$ matches the actual TF characteristics of the signal. Signals with fan geometry are found in practice only in short segments, as in the case of speech [17] or the song of some mammals [18]. In that sense, two options are at hand: either to use a warping function $φ (t)$ with more degrees of freedom than $φ_{α} (t)$ for matching the possible non-linear geometry, or to parse the signal into short segments and
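One simple way to picture per-segment matching is a grid search over candidate chirp rates. The sketch below is not the estimator developed in the paper (that part of the section is truncated here); it assumes a swapped-in criterion, the largest FChT peak magnitude over a frequency grid, and a hypothetical 0.2 s segment with $α$ = 2 s⁻¹. Candidates should keep $1 + α t$ positive over the whole segment.

```python
import cmath
import math


def phi(t, alpha):
    """Warping polynomial phi_alpha(t) = (1 + alpha*t/2) * t."""
    return (1.0 + 0.5 * alpha * t) * t


def fcht_magnitude(x, ts, f, alpha):
    """|X(f, alpha)| approximated by a Riemann sum over the segment."""
    dt = ts[1] - ts[0]
    acc = 0j
    for xi, t in zip(x, ts):
        acc += xi * math.sqrt(abs(1.0 + alpha * t)) * cmath.exp(-2j * math.pi * f * phi(t, alpha))
    return abs(acc) * dt


def estimate_chirp_rate(x, ts, candidates, freqs):
    """Pick the candidate alpha with the largest FChT peak (illustrative criterion only)."""
    def peak(alpha):
        return max(fcht_magnitude(x, ts, f, alpha) for f in freqs)
    return max(candidates, key=peak)


# Hypothetical segment: three partials of a 100 Hz fundamental, true chirp rate 2 1/s.
fs, n_samples = 2000.0, 400
ts = [(n - n_samples // 2) / fs for n in range(n_samples)]
x = [sum(math.cos(2.0 * math.pi * k * 100.0 * phi(t, 2.0)) for k in (1, 2, 3)) for t in ts]

candidates = [-4.0, -2.0, 0.0, 2.0, 4.0]          # coarse grid, in 1/s
freqs = [80.0 + 4.0 * i for i in range(61)]       # 80 Hz to 320 Hz
alpha_hat = estimate_chirp_rate(x, ts, candidates, freqs)
print(alpha_hat)
```

A peak-based score is only one possible concentration measure; it is used here because it needs no tuning, not because the source prescribes it.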

Discrete-time formulation

In analogy to the discrete-time Fourier transform, the discrete-time FChT can be thought of as the continuous-time transform of the signal $x (t) = \sum_{n = - \infty}^{\infty} x [n] δ (t - n T_{s}),$ where $x [n]$ is the discrete-time signal and $T_{s}$ is the sampling interval. This results in $X (Ω, \hat{α}) = \sum_{n = - \infty}^{\infty} x [n] \sqrt{| 1 + \hat{α} n |} e^{- j Ω (1 + (1 / 2) \hat{α} n) n},$ where n is discrete time, $Ω$ is frequency, and the analysis chirp rate $\hat{α}$ is the discrete counterpart of the chirp rate $α$, that is, $\hat{α} = α T_{s} .$ Likewise, frequency $Ω$ is related to
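The discrete-time sum can be evaluated directly for a finite sequence. The sketch below is a brute-force illustration, not the fast algorithm the paper formulates; the function name, the 8 kHz sampling rate and the test frequency of $0.05 π$ rad/sample are assumptions made for the example.

```python
import cmath
import math


def discrete_fcht(x, n0, omega, alpha_hat):
    """Evaluate X(Omega, alpha_hat) for a finite sequence x[n], n = n0, n0 + 1, ..."""
    acc = 0j
    for i, xn in enumerate(x):
        n = n0 + i
        weight = math.sqrt(abs(1.0 + alpha_hat * n))
        acc += xn * weight * cmath.exp(-1j * omega * (1.0 + 0.5 * alpha_hat * n) * n)
    return acc


# Hypothetical numbers: alpha = 2 1/s sampled at 8 kHz gives alpha_hat = alpha * T_s.
fs = 8000.0
alpha_hat = 2.0 / fs
omega0 = 0.05 * math.pi                  # test frequency in rad/sample
n0, length = -400, 800

# A discrete chirp whose phase matches the analysis kernel at Omega = omega0.
x = [cmath.exp(1j * omega0 * (1.0 + 0.5 * alpha_hat * n) * n) for n in range(n0, n0 + length)]

peak = discrete_fcht(x, n0, omega0, alpha_hat)            # every term is real and positive
off_peak = discrete_fcht(x, n0, 2.0 * omega0, alpha_hat)  # kernel at twice the frequency
expected = sum(math.sqrt(abs(1.0 + alpha_hat * n)) for n in range(n0, n0 + length))
print(abs(peak), expected, abs(off_peak))
```

At the matched frequency the phases cancel term by term, so the sum collapses to the sum of the weights $\sqrt{| 1 + \hat{α} n |}$; away from it the terms rotate and largely cancel.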

Results

The first experiment compares the CT, FrFT and FChT on a toy synthetic example. The synthetic signal is a train of non-equidistantly spaced pulses, arranged so that the fundamental frequency changes linearly; for the sake of clarity, the spectral envelope delineated by the harmonics is not flat. The three transforms were applied to that signal in such a way that the resolution achieved around the fourth harmonic $k = +4$ was the highest. The results
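A pulse train of this kind can be sketched by placing one pulse per cycle of a linearly varying fundamental. The code below is an assumption about how such a signal might be built, not the paper's generator: pulses are put where $f_{0} φ_{α} (t)$ crosses an integer, the values $f_{0}$ = 100 Hz and $α$ = 2 s⁻¹ are hypothetical, and the non-flat spectral envelope mentioned in the text is not modelled.

```python
import math


def pulse_times(f0, alpha, t_start, t_end):
    """Instants t_k where f0 * phi_alpha(t_k) = k, one pulse per cycle of the chirped fundamental."""
    def phi(t):
        return (1.0 + 0.5 * alpha * t) * t

    k_min = math.ceil(f0 * phi(t_start))
    k_max = math.floor(f0 * phi(t_end))
    times = []
    for k in range(k_min, k_max + 1):
        if alpha == 0.0:
            times.append(k / f0)  # stationary case: equidistant pulses
        else:
            # Root of (alpha/2) t^2 + t - k/f0 = 0 on the branch where 1 + alpha*t > 0.
            times.append((-1.0 + math.sqrt(1.0 + 2.0 * alpha * k / f0)) / alpha)
    return times


times = pulse_times(f0=100.0, alpha=2.0, t_start=-0.1, t_end=0.1)
gaps = [b - a for a, b in zip(times, times[1:])]
print(len(times), gaps[0], gaps[-1])
```

For a positive chirp rate the gaps between consecutive pulses shrink over time, which is the time-domain signature of a rising fundamental.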

Conclusions

The fan-chirp transform (FChT) is an effective method for representing signals with fan time–frequency (TF) structure. Signals of this type, denoted here as chirp-periodic signals, are common in nature, for example in segments of the song of mammals and in human speech with natural intonation. The FChT possesses the property of marginalizing the WD along straight line paths according to a fan geometry. This geometry is entirely described by the user-defined chirp rate $α$, or by its inverse, the focal point

References (29)

M. Kepesi et al., Adaptive chirp-based time–frequency analysis of speech signals, Speech Commun. (May 2006)
L. Cohen, Time–frequency Analysis (1995)
Mecklenbräuker, Hlawatsch (Eds.), The Wigner Distribution—Theory and Applications in Signal Processing, Elsevier,...
L. Cohen et al., Positive time–frequency distribution functions, IEEE Trans. Acoust. Speech Signal Process. (1985)
P.J. Loughlin et al., Construction of positive time–frequency distributions, IEEE Trans. Signal Process. (October 1994)
S. Mallat et al., Matching pursuit with time–frequency dictionaries, IEEE Trans. Signal Process. (December 1993)
A. Bultan, A four-parameter atomic decomposition of chirplets, IEEE Trans. Signal Process. (March 1999)
Q. Yin et al., A fast refinement for adaptive Gaussian chirplet decomposition, IEEE Trans. Signal Process. (June 2002)
A. Papandreou-Suppappola et al., Analysis and classification of time-varying signals with multiple time–frequency structures, IEEE Signal Process. Lett. (March 2002)
L.R. Rabiner et al., The Chirp-Z transform algorithm and its application, Bell System Technical J. (May–June 1969)
S. Mann et al., The chirplet transform: physical considerations, IEEE Trans. Signal Process. (November 1995)
H.M. Ozaktas et al., The Fractional Fourier Transform with Applications in Optics and Signal Processing (2001)
L.B. Almeida, The fractional Fourier transform and time–frequency representations, IEEE Trans. Signal Process. (November 1994)
R.G. Baraniuk, D.L. Jones, Warped wavelet bases: unitary equivalence and signal processing, in: Proceedings of IEEE...

