Europe PMC

This website requires cookies, and the limited processing of your personal data in order to function. By using the site you are agreeing to this as outlined in our privacy notice and cookie policy.

Abstract 


This paper aims at estimating causal relationships between signals to detect flow propagation in autoregressive and physiological models. The main challenge of the ongoing work is to discover whether neural activity in a given structure of the brain influences activity in another area during epileptic seizures. This question refers to the concept of effective connectivity in neuroscience, i.e. to the identification of information flows and oriented propagation graphs. Past efforts to determine effective connectivity rooted to Wiener causality definition adapted in a practical form by Granger with autoregressive models. A number of studies argue against such a linear approach when nonlinear dynamics are suspected in the relationship between signals. Consequently, nonlinear nonparametric approaches, such as transfer entropy (TE), have been introduced to overcome linear methods limitations and promoted in many studies dealing with electrophysiological signals. Until now, even though many TE estimators have been developed, further improvement can be expected. In this paper, we investigate a new strategy by introducing an adaptive kernel density estimator to improve TE estimation.

Free full text 


Logo of halLink to Publisher's site
Conf Proc IEEE Eng Med Biol Soc. Author manuscript; available in PMC 2014 Feb 3.
Published in final edited form as:
Conf Proc IEEE Eng Med Biol Soc. 2013; 2013: 4342–4345.
https://doi.org/10.1109/EMBC.2013.6610507
PMCID: PMC3898961
HALMS: HALMS881459
PMID: 24110694

Exploring neural directed interactions with transfer entropy based on an adaptive kernel density estimator

Abstract

This paper aims at estimating causal relationships between signals to detect flow propagation in autoregressive and physiological models. The main challenge of the ongoing work is to discover whether neural activity in a given structure of the brain influences activity in another area during epileptic seizures. This question refers to the concept of effective connectivity in neuroscience, i.e. to the identification of information flows and oriented propagation graphs. Past efforts to determine effective connectivity rooted to Wiener causality definition adapted in a practical form by Granger with autoregressive models. A number of studies argue against such a linear approach when nonlinear dynamics are suspected in the relationship between signals. Consequently, nonlinear nonparametric approaches, such as transfer entropy (TE), have been introduced to overcome linear methods limitations and promoted in many studies dealing with electrophysiological signals. Until now, even though many TE estimators have been developed, further improvement can be expected. In this paper, we investigate a new strategy by introducing an adaptive kernel density estimator to improve TE estimation.

Keywords: bioelectric phenomena, brain models, causality, electroencephalography, entropy, medical signal detection, neurophysiology, nonlinear dynamical systems, stochastic processes

I. Introduction

In neuroscience, recent works have been devoted to detecting effective connectivity [1] defined as a causal influence of the dynamics of a first system on the dynamics of a second one. In this context, two questions are commonly addressed: (i) how to choose a formal quantitative definition of effective connectivity and (ii) how to provide corresponding estimators defined as functions of signals recorded in both systems. Nowadays two approaches contrast. The first one does not rely on an underlying physiological model while the second one, namely dynamical causal modeling, does. In this contribution, we are only concerned with the first approach including linear and nonlinear methodologies, and we consider nonlinear nonparametric entropic characterization of this connectivity using the so-called transfer entropy (TE). When computed on a stationary bivariate time series (X, Y), this quantity measures the amount of information transferred from channel X (resp. Y) to channel Y (resp. X) and is denoted TExy (resp. TEyx) hereafter. It was introduced by Schreiber [2] and defined as the Kullback-Leibler divergence between two different predictive probability distributions of Yn. The first distribution is defined conditionally to amplitudes of Xn, n′ < n and Yn, n′ < n, at time instants prior to n, and the second one is only defined conditionally to Yn, n′ < n. A simple exchange of X and Y leads to the definition of TEyx. Formal definition is given by

TExy=Rk+l+1log(pYn+1/Ynk,Xnl(y,yk,xl)pYn+1/Ynk(y,yk))×pYn+1,Ynk,Xnl(y,yk,xl)d(y,yk,xl)
(1)

with Unm(Un,,Un-m+1), pU (u) denotes the probability density of a random vector U at u, k and l are the predictor dimensions. Let us note that the definition of TE is qualitatively consistent with Wiener and Granger approaches, which only compare mean square prediction errors. In theory, TExy is never negative and is equal to zero iif

pYn+1/Ynk,Xnl(y,yk,xl)=pYn+1/Ynk(y,yk),yR
(2)

The choice of k and l can impact drastically on theoretical TE value and, without a priori information on the hidden nonlinear dynamics generating (X, Y), this issue is not trivial and is not discussed in this paper. Given the theoretical index and an N point observation (X, Y)n, n = 1..N, we have to determine an estimation procedure to compute TE^xy from

(yn+1,ynk,xnl),n=n0,,N-1(n0=1+max(k,l))
(3)

If all probability densities are known, the trivial Monte Carlo estimator could be:

1N-n0n=n0N-1log(pYn+1,Ynk,Xnl(yn+1,ynk,xnl)pYnk(ynk)pYn+1,Ynk(yn+1,ynk)pYnk,Xnl(ynk,xnl))
(4)

Since the densities are unknown, a first method consists in replacing each density pU by an estimation [p with hat]U possibly obtained by a fixed size kernel estimation approach as proposed in [3]. A second method computing estimations log(pU)^ of log (pU) from K Nearest Neighbors (KNN) selection was developed for mutual information estimation in [4] and applied in [5]. It is also possible to compute the estimation of a density pU with adaptive size kernels. We propose this improvement to compute TE and compare our results on linear Gaussian models (i) with corresponding theoretical values, and (ii) with TE estimated using a fixed bandwidth kernel and/or KNN kernel as in [5]. Then, these methods are compared on a neurophysiological model [11].

II. Methods and Materials

A. Kernel methods

In order to reconstruct a probability density pU from N observed states un, the general form of a fixed kernel density estimator (FKDE) of bandwidth h is given by:

p^U(u)=1Nn=1N1hK(u-unh)=1Nn=1NKh(u-un)
(5)

where K denotes a kernel function. For joint density probability estimated at (y, yk, xl), we write:

p^YYX(y,yk,xl)=1ΔmΔ1hyk+1hxlK(y-ym+1hy)j=1kK(yk(j)-ym-j+1hy)j=1lK(xl(j)-xm-j+1hx)
(6)

where Δ = {m / max(k, l) < mN − 1, |mn| > τc}, and τc is the decorrelation time defined as the minimum delay leading to a correlation coefficient equal to 0.1. Parameters hx, hy are the respective kernel bandwidths for signals x and y which are normalized. A fixed bandwidth (independent of m) is unable to deal satisfactorily with the tails of the distribution without over smoothing the main part of this distribution. To avoid this issue, two methods help in estimating a density f at a point x :

f^1(x)=1Nm=1NKhx(x-xm)wherehxh(x)
(7)

f^2(x)=1Nm=1NKhm(x-xm)wherehmh(xm)
(8)

The first method (7) uses a x dependent bandwidth, this bandwidth being unchanged for different points xm. One example of this method is a KNN estimator [6]. The second method (8) uses a xm dependent bandwidth, which does not depend on x, leading to the adaptive kernel density estimator (AKDE) [7] we adopted.

B. Adaptive kernel density estimator

AKDE is an improved alternative to FKDE. Given an initial bandwidth h and a first FKDE based estimation f0, Abramson [8] adapted the bandwidth according to these initial quantities:

hmh0/f^0(xm)
(9)

In [9], Hwang extended this procedure:

hm=(f^0(xm)/g)-rh0
(10)

where g is the geometric mean of (f0 (xm))m, i.e.

logg=1Nm=1Nlogf^0(xm)
(11)

r is an user defined sensitivity parameter generally satisfying 0 < r < 1. For different dimensions, the value of r should change. However, it is difficult to choose the proper r for four different probability density estimations. We suggest to compute p^YYX(yn+1,ynk,xnl) before its substitution in (4) using (10) and to leave the other densities unchanged. The three steps of the proposed algorithm are as follows:

  • Step 1: for nh ([similar, equals]10) values vi of h, 0 < v1 <..<vnh ≤3, compute fixed kernel density estimations at (yn+1,ynk),(ynk,xnl),ynk and (yn+1, ynk,xnl) ;

  • Step 2: compute hm using (10) only to update p^YYX(yn+1,ynk,xnl) leaving the other densities unchanged, and search the set Vh of values vi leading to unimodality of fvi:r{0<r1,,r20<1}TE^xy(hm(vi,r))=fvi(r) for vi [set membership] {v1,.., vnh}hence eliminating monotonic curves (no multimodal curve was observed);

  • Step 3: finally retain the maximum value TE^xy=max(h,r)Vh×]0;1[(fh(r)).

Experimentally, the selected value hs in the last step is often close to the initial bandwidth h0. Clearly, the computation time is increased with AKDE (multiplied by 20) for updating the density in step 2.

III. Experimental Results

We tested our method with Gaussian kernels to compute TEx y and TEy x on two kinds of signals. The first kind included two toy linear AutoRegressive (AR) models and the second one was a realistic EEG model. Predictor dimensions k and l were chosen equal to the corresponding AR models orders estimated by the generalized Bayesian Information Criterion as in [10]. For AR models, the decorrelation time was τc=20 and experiments were repeated 200 times on 1024-point signals to get averaged values.

A. Unidirectional linear model

For the first linear stochastic system (model 1), the following two signals were generated:

{x(n)=1.3435x(n-1)-0.9025x(n-2)+e1(n)y(n)=0.5x(n-3)-0.4y(n-2)+e2(n)
(12)

where e1 and e2 were independent white Gaussian noises with zero means and unit variances.

Fig. 1 displays TE computed with a fixed bandwidth h. Experimental transfer entropy TE^xy is smaller than the theoretical value which is equal to Granger Causality index divided by 2 (GC/2 was computed from the model coefficients) in the case of Gaussian signals (see Table I). Fig. 2 corresponds to TE values vs. r using AKDE. The flow propagation from signal x to signal y was correctly established whereas the estimated influence from signal y to signal x was not significant. Step 3 of the algorithm led to hs = 0.45. Comparing Fig. 1 and Fig. 2, TE estimated using AKDE is much closer to the exact value (0.41). We also compared our estimator with Trentool toolbox [5] and concluded to its relevant behavior as seen in Table I which allows to compare the different estimators in terms of mean and standard deviation. It reveals visible improvement in TE^xy all other performance with Gaussian AKDE over estimators.

An external file that holds a picture, illustration, etc.
Object name is halms881459f1.jpg

Results of TE between two time series x and y (model 1) using a Gaussian kernel and a fixed bandwidth h

Solid line: TE^xy, dashed line: TE^yx

Horizontal line: exact value TExy=0.41

An external file that holds a picture, illustration, etc.
Object name is halms881459f2.jpg

Results of TE between two time series x and y (model 1) using a Gaussian kernel and AKDE (10)

Solid line: TE^xy, dashed line: TE^yx

Horizontal line: exact value TExy=0.41

TABLE I

Mean values (and standard deviation in parentheses) of transfer entropy using the different estimators on model 1

EstimatorXYYX
GC/20.41460
TE (fixed h)0.3242 (0.0158)0 (0)
TE (AKDE)0.4063 (0.0179)0.01267(0.0045)
Trentool0.3484 (0.0115)−0.0158 (0.0070)

B. Bidirectional linear model

For the second linear stochastic system (model 2), we generated the following signals:

{x(n)=0.5x(n-1)+0.3y(n-2)+e1(n)y(n)=0.5x(n-3)-0.4y(n-1)+e2(n)
(13)

where e1 and e2 were as in (12). Fig. 3 and Fig. 4 allow to compare TE values using either a fixed bandwidth (Fig. 3) or AKDE (Fig. 4, hs = 0.45). For this model, the exact value of TE from signal x to signal y (resp. from signal y to signal x) given in Table II is represented by a solid grey line (resp. a dashed grey line) in Fig. 3 and and4.4. Focusing on Fig. 4, the bidirectional flow propagation was correctly detected using AKDE, the mean values of TE being close to the exact ones (see also Table II). This figure reveals that the bias of AKDE estimator is negligible. As for TE estimated with a fixed bandwidth (Fig. 3 and Table II), its values remain lower than the exact ones, similarly as those estimated with Trentool toolbox. For all estimators tested, the standard deviation is 5 to 10 times lower than the corresponding mean value.

An external file that holds a picture, illustration, etc.
Object name is halms881459f3.jpg

Results of TE between two time series x and y (model 2) using a Gaussian kernel and a fixed bandwidth h

Solid line: TE^xy, dashed line: TE^yx, Horizontal lines: exact values

An external file that holds a picture, illustration, etc.
Object name is halms881459f4.jpg

Results of TE between two time series x and y (model 2) using a Gaussian kernel and AKDE (10)

Solid line: TE^xy, dashed line: TE^yx, Horizontal lines: exact values

TABLE II

Mean values (and standard deviation in parentheses) of transfer entropy using the different estimators on model 2

EstimatorXYYX
GC/20.15110.0630
TE (fixed h)0.1118 (0.0123)0.0422 (0.0083)
TE (AKDE)0.1457 (0.0133)0.0689 (0.0087)
Trentool0.1120 (0.0091)0.0446 (0.0079)

C. Physiology based model

We simulated EEG signals with a nonlinear SDE (stochastic differential equation) model [11] of order 20 to represent the activities of two neuronal populations PopX and PopY:

d[XITXET]=gX(XI,XE,θ)dt+dWXd[YITYET]=gY(YI,YE,θ,KXYSig(X))dt+dWYX(t)=L(XI(t),XE(t)),Y(t)=L(YI(t),YE(t))
(14)

where the line vectors XITXET and YITYET are in R10 (T is the transpose operator) and denote dynamical state evolutions for PopX and PopY aggregating components []I and []E modeling respectively interacting inhibitory and excitatory activities. Parameterized nonlinear functions (u, v) → g (u, v, θ) drive the state evolutions. Sig(.) is a nonlinear sigmoidal function. The coupling parameter KXY allows effective connectivity adjustment from the first population to the second one. X and Y are computed with a linear function L and are interpretable as two intracranial EEG signals recorded from proximal field electrodes. The independent Brownian processes WX and WY represent random surrounding populations influences. The parameters vector θ includes three scalar components A, B and G, allowing to modify the type of activity (normal/epileptic). The model was time discretized by Euler scheme to produce two discrete time outputs. As in [11], we fixed these parameters to 5, 3, 20 in PopX and to 3.5, 3.5, 84 in PopY. This resulted in a narrow band fast activity around 25 Hz (similar to that observed at seizure onset) in each population. KXY was set to 1500. Fifty blocks of 8-second length signals were simulated with a sampling rate of 256 Hz. In this experiment, as the influence from one physiological signal to another one may be largely delayed, and to get a not too large predictor dimension l, we first shifted signal Y as proposed in [1]. The delay corresponded to 33 sampling time instants and was determined from cross covariance maximization. The maximum order in the model (after shifting) was set to 2 and τc was set to 500. According to Fig. 5 and Table III, we conclude to the relevance of the new estimator compared to the “references” given by Granger causality index (GC/2) and Trentool toolbox. As a matter of fact, TE^xy and TE^yx are sensibly more contrasted (considering means and standard deviations) with Gaussian AKDE than with the two other methods (Table III). Moreover, when comparing the mean values of TE^ obtained for the different methods with this physiological model, a larger dispersion was observed than with previous linear models 1 and 2. The difference between the AKDE based estimator and GC/2 could be expected due to the nonrobustness of Granger index to nonlinearities. On the other hand, the difference between the AKDE based estimator and Trentool estimator (which even failed in detecting the flow direction) was unexpected.

An external file that holds a picture, illustration, etc.
Object name is halms881459f5.jpg

Results of TE between two delayed simulated EEG signals using a Gaussian kernel and AKDE (10)

Solid line: TE^xy, dashed line: TE^yx

TABLE III

Mean values (and standard deviation in parentheses) of transfer entropy using the different estimators on the physiological model

EstimatorXYYX
GC/20.011 (0.0193)0.0028 (0.0009)
TE (AKDE)0.2521 (0.1143)0.1249 (0.0615)
Trentool0.0049 (0.3182)0.0091 (0.0083)

IV. Conclusion

In this paper, we focused on information propagation between two observations using TE and introduced an adaptive kernel density estimator to improve fixed kernel TE estimator. Results on simulated AR models revealed a very low bias with AKDE approach and proved the relevance of this new method in detecting uni/bi-directional propagation flows. Using a fixed bandwidth or Trentool approach led to much more biased values. For physiological signals, even if we had no ground-truth, the causal effects were perfectly identified and allowed characterizing the driving system and the responding one. In the future, the AKDE method will be tested on real EEG signals and on more complex scenarios including stronger nonlinearities and/or multivariate observations. A validation phase including statistical hypothesis tests based on surrogate data will complete this work.

References

1. Vicente R, Wibral M, Lindner M, Pipa G. Transfer entropy - a model-free measure of effective connectivity for the neurosciences. J of Computational Neuroscience. 2011;30:45–67. [Europe PMC free article] [Abstract] [Google Scholar]
2. Schreiber T. Measuring information transfer. Physical Review Letters. 2000;85:461–464. [Abstract] [Google Scholar]
3. Kaiser A, Schreiber T. Information transfer in continuous processes. Physica D. 2002;166:43–62. [Google Scholar]
4. Kraskov A, Stögbauer H, Grassberger P. Estimating mutual information. Physical review E. 2004;69(6):066138:1–16. [Abstract] [Google Scholar]
5. Lindner M, Vicente R, Priesemann V, Wibral M. TRENTOOL: A Matlab open source toolbox to analyse information flow in time series data with transfer entropy. BMC Neuroscience. 2011;12:119. [Europe PMC free article] [Abstract] [Google Scholar]
6. Loftsgaarden DO, Quesenberry CP. A nonparametric estimate of a multivariate density function. The Annals of Mathematical Statistics. 1965;36(3):1049–1051. [Google Scholar]
7. Silverman BW. Density estimation for statistics and data analysis, Monographs on Statistics and Applied Probability. London: Chapman & Hall; 1986. [Google Scholar]
8. Abramson IS. On bandwidth variation in kernel estimate-A square root law. Annals Statist. 1982;10:1217–1223. [Google Scholar]
9. Hwang JN. Nonparametric multivariate density estimation: a comparative study. IEEE Trans Signal Processing. 1994;42:2795–2810. [Google Scholar]
10. Yang C, Le Bouquin Jeannès R, Bellanger JJ, Shu H. A new strategy for model order identification and its application to transfer entropy for EEG signals analysis. IEEE Trans on Biomedical Engineering. (under press) [Abstract] [Google Scholar]
11. Wendling F, Hernandez A, Bellanger JJ, Chauvel P, Bartolomei F. Interictal to ictal transition in human temporal lobe epilepsy: insights from a computational model of intracerebral EEG. J of Clinical Neurophysiology. 2005;22:343–356. [Europe PMC free article] [Abstract] [Google Scholar]

Citations & impact 


Impact metrics

Jump to Citations

Article citations

Similar Articles 


To arrive at the top five similar articles we use a word-weighted algorithm to compare words from the Title and Abstract of each citation.