Abstract
Statistical inference is essential to the analysis of neural spike trains in computational neuroscience. Current approaches follow a general inference paradigm in which a parametric probability model characterizes the temporal evolution of the underlying stochastic processes. To directly capture the overall variability and distribution in the space of spike trains, we focus on a data-driven approach in which statistics are defined and computed in a function space where spike trains are viewed as individual points. To this end, we first develop a parametrized family of metrics that accounts for different warpings in the time domain and generalizes several spike train distances in current use. These new metrics are essentially penalized L p norms of appropriate functions of spike trains, with penalties associated with time warping. Based on the new metrics with p = 2 (corresponding to the “Euclidean distance”), we then define notions of means and variances of spike trains. Under some restrictive conditions, we present an efficient recursive algorithm, termed the Matching-Minimization algorithm, to compute the sample mean of a set of spike trains with arbitrary numbers of spikes. The proposed metrics and mean spike trains are demonstrated on simulations as well as on an experimental recording from the motor cortex. All methods achieve desirable performance, and the results support the success of this novel framework.
Acknowledgements
This research was supported in part by grants NSF IIS-0916154 to WW and NSF DMS-0915003, ONR N00014-09-1-0664, and AFOSR FA9550-06-1-0324 to AS. We thank Prof. Nicholas Hatsopoulos for providing experimental data. We also thank Prof. Eric Klassen for helpful discussions.
Action Editor: Rob Kass
Appendices
Appendix A: Metric D p [σ, λ]
Here we show that the distance D p [σ, λ] satisfies the following three conditions in the definition of a metric space, where p ≥ 1, σ > 0, and λ > 0.
(a) Non-negativity: D(f,g) ≥ 0 for any f, g \(\in \mathcal{S}^{(\sigma)}\). The equality holds if and only if f = g.
Proof
The non-negativity is apparent.□
(b) Symmetry: D(f,g) = D(g,f) for any f, g \(\in \mathcal{S}^{(\sigma)}\).
Proof
For any γ(t) ∈ Γ, let s = γ(t). Then t = γ − 1(s), s = γ(γ − 1(s)), and \(1 = \dot \gamma (\gamma^{-1}(s)) \dot \gamma^{-1}(s)\). Therefore,
\[\int \big|f(t)^{1/p} - [g(\gamma(t))\dot \gamma(t)]^{1/p}\big|^p + \lambda \big|1 - \dot \gamma(t)^{1/p}\big|^p \, dt = \int \big|[f(\gamma^{-1}(s))\dot \gamma^{-1}(s)]^{1/p} - g(s)^{1/p}\big|^p + \lambda \big|\dot \gamma^{-1}(s)^{1/p} - 1\big|^p \, ds.\]
Taking the infimum over γ(t) and γ − 1(s) on both sides, the symmetry holds.□
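The change-of-variables identity used in this proof can be checked numerically. Below is a minimal sketch with smooth stand-ins for the smoothed spike trains and an explicit warping; all concrete choices of T, p, λ, f, g, and γ are illustrative, not from the paper.

```python
import numpy as np

T, p, lam = 1.0, 2.0, 0.1
t = np.linspace(0.0, T, 100001)

f = lambda x: 1.0 + np.sin(2 * np.pi * x) ** 2   # smooth stand-in for a smoothed train
g = lambda x: 1.0 + np.cos(np.pi * x) ** 2

E = np.e - 1.0
gamma      = lambda x: (np.exp(x) - 1.0) / E     # warping [0,1] -> [0,1]
dgamma     = lambda x: np.exp(x) / E
gamma_inv  = lambda s: np.log(1.0 + E * s)
dgamma_inv = lambda s: E / (1.0 + E * s)

def trapz(y, x):
    # simple trapezoidal rule
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def cost(a, b, gam, dgam):
    """Penalized L^p cost of matching a onto b via the warping gam."""
    match = np.abs(a(t) ** (1 / p) - (b(gam(t)) * dgam(t)) ** (1 / p)) ** p
    penalty = lam * np.abs(1.0 - dgam(t) ** (1 / p)) ** p
    return trapz(match + penalty, t)

c1 = cost(f, g, gamma, dgamma)          # f matched to g via gamma
c2 = cost(g, f, gamma_inv, dgamma_inv)  # g matched to f via gamma^{-1}
# c1 and c2 agree up to discretization error, as the identity predicts
```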
(c) Triangle-inequality: D(f,g) ≤ D(f,h) + D(h,g) for any f, g, h \(\in \mathcal{S}^{(\sigma)}\).
Proof
The triangle-inequality will hold if for any \(\gamma(t), \gamma_1(t) \in \Gamma\) we have
where ∘ denotes function composition in Γ (and therefore \(\gamma \circ \gamma_1^{-1}\) is still a warping in the group Γ). Letting s = \(\gamma_1(t)\), the second term on the right-hand side of Eq. (14) is equal to
Therefore, Eq. (14) will hold if we can show that for any a, b, c, d, e, and f in the L p space
Indeed, Eq. (15) is a direct result of the triangle inequalities in the L p space and the L p norm on the Euclidean \(\Re^2\) vector space.□
Appendix B: Metric d p [λ]
(B.1) First, we show that the distance d p [λ] also satisfies the following three conditions in the definition of a metric space, where p ≥ 1 and λ > 0.
(a) Non-negativity: d(f,g) ≥ 0 for any f, g \(\in \mathcal{S}\). The equality holds if and only if f = g.
Proof
The non-negativity is apparent.□
(b) Symmetry: d(f,g) = d(g,f) for any f, g \(\in \mathcal{S}\).
Proof
Assume \(f \in \cal{S}_M\) and \(g \in \cal{S}_N\). For any γ(t) ∈ Γ, let s = γ(t). Then t = γ − 1(s), s = γ(γ − 1(s)), and \(1 = \dot \gamma (\gamma^{-1}(s)) \dot \gamma^{-1}(s)\). Therefore,
\[M + N - 2\sum_{i=1}^M \sum_{j=1}^N {\bf 1}_{t_i = \gamma(s_j)} + \lambda \int |1 - \dot \gamma(t)^{1/p}|^p \, dt = N + M - 2\sum_{j=1}^N \sum_{i=1}^M {\bf 1}_{s_j = \gamma^{-1}(t_i)} + \lambda \int |\dot \gamma^{-1}(s)^{1/p} - 1|^p \, ds.\]
Taking the infimum over γ(t) and γ − 1(s) on both sides, the symmetry holds.□
(c) Triangle-inequality: d(f,g) ≤ d(f,h) + d(h,g) for any f, g, h \(\in \mathcal{S}\).
Proof
Assume \(f = \sum_{i=1}^L \delta(t - r_i)\), \(g = \sum_{j=1}^M \delta(t - s_j)\), and \(h = \sum_{k=1}^N \delta(t - t_k)\). The triangle-inequality will hold if for any \(\gamma(t), \gamma_1(t) \in \Gamma\) we have
where ∘ denotes function composition in Γ. Letting s = γ(t) in the first and third integrals and s = \(\gamma_1(t)\) in the second, Eq. (16) can be written as
It is straightforward to verify that
This implies that
Therefore, Eq. (16) will hold if we can show that for any non-negative values x, y, z with x ≤ y + z and a, b, c in the L p space
Indeed, Eq. (17) is also a direct result from the triangle inequalities in the L p space and the L p norm in the Euclidean \(\Re^2\) vector space.□
(B.2) Then, we show that when σ → 0, the limiting form of D p [σ, λ] is d p [λ].
Proof
Assume \(f(t) = \sum_{i=1}^M K_{\sigma}(t-t_i) \in {\cal S}_M^{(\sigma)}\) and \(g(t) = \sum_{j=1}^N K_{\sigma}(t-s_j) \in {\cal S}_N^{(\sigma)}\) are two smoothed spike trains on [0, T]. When σ is sufficiently small, spikes in f and g either fully match (the two spike kernels overlap entirely) or do not match at all (the two spike kernels are well separated). Moreover, when two spike kernels K σ (t − t i ) in f and K σ (t − s j ) in g are fully matched, there must be no time warping in that match; that is, γ(s j ) = t i and \(\dot \gamma (t) = 1\) for s j − σ < t < s j + σ.
This indicates that
Therefore, when σ is sufficiently small,
\[\int_0^T \big|f(t)^{1/p} - [g(\gamma(t))\dot \gamma(t)]^{1/p}\big|^p \, dt = M + N - 2\sum_{i=1}^M \sum_{j=1}^N {\bf 1}_{t_i = \gamma(s_j)}.\]
This is the total number of spike kernels that are not overlapped after warping. As the penalty term in d p [λ] is exactly the same as that in D p [σ, λ], d p [λ] is the limiting form of D p [σ, λ] when σ→0.□
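The counting argument above can be checked numerically. A minimal sketch with unit-area box kernels and the identity warping; the kernel shape, spike times, and σ are our illustrative choices.

```python
import numpy as np

T, p, sigma = 10.0, 2.0, 0.05
t = np.linspace(0.0, T, 400001)

def smooth(spikes):
    # unit-area box kernels of half-width sigma centered at each spike
    out = np.zeros_like(t)
    for c in spikes:
        out[np.abs(t - c) < sigma] = 1.0 / (2 * sigma)
    return out

f_spikes = [1.0, 3.0, 5.0, 8.0]   # M = 4
g_spikes = [1.0, 4.0, 8.0]        # N = 3; the spikes at 1.0 and 8.0 match f exactly
f, g = smooth(f_spikes), smooth(g_spikes)

# matching term of D_p with the identity warping gamma(t) = t
integrand = np.abs(f ** (1 / p) - g ** (1 / p)) ** p
val = float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(t)) / 2.0)

matches = len(set(f_spikes) & set(g_spikes))
expected = len(f_spikes) + len(g_spikes) - 2 * matches   # M + N - 2 * (#matches)
# val ~ expected: each unmatched unit-area kernel contributes 1 to the integral
```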
Appendix C: Limiting form for upper bounds
Here we show that the upper bound is exact in some limiting form for both D 2 and D 1 distances. Our proof is based on an example plot in Fig. 11, which shows an infinite sequence of piecewise linear warpings from [0, T] to [0, T]. Computing the integral of the piecewise linear warping, we have
and
This proves that the upper bound, 2 λT, is exact for both D 2 and D 1.
Appendix D: Optimal time warping
Here we state some theoretical facts on the optimal time warping.
Lemma 1
For any positive diffeomorphism γ from [a, c] to [b, d] (i.e. \(\gamma(a) = b, \gamma(c) = d, 0 < \dot \gamma(t) < \infty\) ) and 1 ≤ p < ∞, the time warping has the following optimal cost:
When p > 1, the equality holds if and only if the warping is linear from [a, c] to [b, d]. When p = 1, linearity of the warping is sufficient but not necessary.
Proof
On the domain [a, c], for any function f(t) ∈ L p, we denote its L p norm as \(||f||_p = (\int_a^c |f(t)|^{p} dt)^{1/p}\). Then,
Using theory in the L p spaces, when p > 1, the equality holds if and only if \(\dot \gamma(t)\) is a positive constant, i.e. the warping is linear from [a, c] to [b, d]. When p = 1, the equality holds if the sign of \(1 - \dot \gamma(t)\) does not change in [a, c] (always nonnegative or always nonpositive), for example, if the warping is linear from [a, c] to [b, d].□
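Lemma 1 can be illustrated numerically for p = 2 with [a, c] = [0, 1] and [b, d] = [0, 2]: the linear warping attains the bound \(|(c-a)^{1/p} - (d-b)^{1/p}|^p\) implied by the reverse triangle inequality in the proof, while a nonlinear warping costs strictly more. The specific warpings below are our choices.

```python
import numpy as np

p = 2.0
t = np.linspace(0.0, 1.0, 100001)

def penalty(dgamma_vals):
    # \int_a^c |1 - \dot\gamma(t)^{1/p}|^p dt via the trapezoidal rule
    y = np.abs(1.0 - dgamma_vals ** (1 / p)) ** p
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

linear_cost    = penalty(np.full_like(t, 2.0))   # gamma(t) = 2t, constant slope
nonlinear_cost = penalty(4.0 * t)                # gamma(t) = 2t^2, same endpoints

# bound |(c-a)^{1/p} - (d-b)^{1/p}|^p with [a,c]=[0,1], [b,d]=[0,2]
bound = abs(1.0 ** (1 / p) - 2.0 ** (1 / p)) ** p
# linear_cost equals the bound; nonlinear_cost exceeds it
```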
Theorem 1
Let spike trains \(f, g \in \mathcal{S}_M\) on [0, T] with corresponding ISIs \(\Delta s_1, \cdots, \Delta s_{M+1}\) and \(\Delta t_1, \cdots, \Delta t_{M+1}\), respectively. If \(\lambda < 1/(2^{p-1} T)\), then
Proof
On the domain [0, T], for any function h(t) ∈ L p, we denote its L p norm as \(||h||_p = (\int_0^T |h(t)|^{p} dt)^{1/p}\). Then,
This indicates that the total penalty is bounded by \(\lambda 2^p T\). If one spike in one train does not match any spike in the other train, the matching term costs at least 2. Therefore, when \(\lambda 2^p T < 2\), i.e. \(\lambda < 1/(2^{p-1} T)\), all M spikes in f and all M spikes in g must be matched one-to-one at the optimal distance.
When all spikes are matched, the distance between two spike trains only depends on the optimal warping cost for each pair of ISIs. The distance in Eq. (19) is a direct result from Lemma 1.□
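Eq. (19) itself is not displayed above; from the proof, when \(\lambda < 1/(2^{p-1}T)\) every spike is matched and the distance reduces to the λ-weighted warping cost over the M + 1 corresponding ISI pairs. The sketch below encodes that reading; the function name, the default parameters, and the final p-th root are our assumptions, not verbatim from the paper.

```python
def dp_equal_size(s, t, T, p=2.0, lam=0.01):
    """Sketch of the Theorem 1 closed form for two trains with equal spike counts.

    s, t: sorted spike times of the two trains on [0, T].
    Assumes lam < 1/(2^{p-1} T) so that spikes match one-to-one.
    """
    assert len(s) == len(t) and lam < 1.0 / (2 ** (p - 1) * T)
    isi_s = [b - a for a, b in zip([0.0] + s, s + [T])]   # M+1 ISIs of f
    isi_t = [b - a for a, b in zip([0.0] + t, t + [T])]   # M+1 ISIs of g
    cost = sum(abs(ds ** (1 / p) - dt ** (1 / p)) ** p
               for ds, dt in zip(isi_s, isi_t))           # optimal warping cost (Lemma 1)
    return (lam * cost) ** (1 / p)

d = dp_equal_size([1.0, 3.0], [1.5, 3.5], T=10.0)
```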
Theorem 2
Let spike trains \(f \in \mathcal{S}_M\) and \(g \in \mathcal{S}_N\) on [0, T]. If \(\lambda < 1/(2^{p-1} T)\), then
Proof
Without loss of generality, assume M ≥ N. The matching cost is minimal, with value M − N, when all N spikes in g are matched with N (of the M) spikes in f. In this case, we have N + 1 pairs of ISIs associated with the N spikes in f and the N spikes in g. As shown in Theorem 1, the total penalty is always bounded by \(\lambda 2^p T\). Therefore, when \(\lambda < 1/(2^{p-1} T)\), the N spikes in g must match N (of the M) spikes in f at the optimal total cost, which results in the following distance:
Moreover, using Theorem 1, if the N + 1 pairs of ISIs are \(\{\Delta s_k\}_{k=1}^{N+1}\) and \(\{\Delta t_k\}_{k=1}^{N+1}\), then the optimal time warping cost is: \(\sum_{k=1}^{N+1} |{\Delta s_k}^{1/p} - {\Delta t_k}^{1/p}|^p.\) Therefore, the distance between f and g can be further represented in the following form:
□
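The Theorem 2 reduction can likewise be sketched by brute force over which N of the M spikes are matched. This is our reading of the proof, not an algorithm given in the paper; the function name, defaults, and the final p-th root are illustrative assumptions.

```python
from itertools import combinations

def dp_unequal(s, t, T, p=2.0, lam=0.01):
    """Brute-force sketch of the Theorem 2 distance for unequal spike counts.

    With lam small enough, all N spikes of the shorter train match some N of
    the M spikes of the longer one; the matching term contributes |M - N| and
    the warping penalty is minimized over the choice of matched spikes.
    """
    if len(s) < len(t):
        s, t = t, s                                   # ensure len(s) = M >= N = len(t)
    best = float("inf")
    for sub in combinations(s, len(t)):               # candidate matched spikes in f
        isi_f = [b - a for a, b in zip((0.0,) + sub, sub + (T,))]
        isi_g = [b - a for a, b in zip([0.0] + t, t + [T])]
        cost = sum(abs(df ** (1 / p) - dg ** (1 / p)) ** p
                   for df, dg in zip(isi_f, isi_g))   # ISI warping cost (Lemma 1)
        best = min(best, cost)
    return (abs(len(s) - len(t)) + lam * best) ** (1 / p)

d = dp_unequal([1.0, 3.0, 7.0], [1.0, 7.0], T=10.0)
```

The combinatorial search is exponential in N and is only meant to make the statement concrete; the paper's dynamic-programming machinery is the practical route.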
Appendix E: Convergence of the MM-algorithm
Denote by S (i) the estimated mean in the ith iteration of the MM-algorithm. We now show that the sum of squared distances \(\sum_{k=1}^K d_2^2(S_k, S^{(i)})\) decreases at each iteration. That is,
As 0 is a natural lower bound, the iteration always converges.
Proof
In the ith iteration, we start with S (i − 1) and let d k,i − 1 denote the distance between S k and S (i − 1) based on the current spike-matching subset. The number of spikes in the mean, n, is known beforehand, so the squared distance of the matching part is always |n k − n|, which is invariant to the subset of spikes used in the matching process. Therefore, we only need to examine how the optimal matching cost changes from S (i − 1) to S (i).
In Case 1, we compute the distance r k between S k and S (i − 1) and find the n + 1 ISIs in S k that have the optimal time warping onto the n + 1 ISIs in S (i − 1). As the ISIs are optimally selected, r k ≤ d k,i − 1. In Case 2, the decrease in distance also holds. However, as n k < n, there are only n k + 1 ISIs, which cannot be used to compute the mean via the closed-form formula in Eq. (11). In the MM-algorithm, we propose to linearly interpolate n − n k imaginary spikes in S k based on the n k pairs of matched spikes between S k and S (i − 1). Here we show that this interpolation does not change the warping distance between S k and S (i − 1).
Indeed, as the warping distance is represented as a sum of squares (see the proof of Theorem 2), we only need to show the invariance of each squared term. Let Δa be one ISI in S k and Δb the associated ISI in S (i − 1). Their squared distance is
Assume there are h spikes within the corresponding ISI in S (i − 1), which partition Δb into h + 1 sub-ISIs, denoted (u 1, ⋯ , u h + 1) with \(\sum_{j=1}^{h+1} u_j = \Delta b\). Our algorithm inserts h spikes in the ISI in S k using linear interpolation, so the h + 1 sub-ISIs in S k have lengths
Then the new warping distance based on these sub-ISIs is
In summary, over Step 2, the distance between S k and S (i − 1) has decreased, and we have found a set of n + 1 ISIs in S k corresponding to the n + 1 ISIs in S (i − 1). In Step 3, we update S (i − 1) to S (i) using the closed-form formula in Eq. (11), which minimizes the sum of squared distances. Therefore,
As r k ≤ d k,i − 1, k = 1, ⋯ , K, we have
□
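The invariance step above — splitting Δa in proportion to the sub-ISIs of Δb — can be verified numerically for p = 2; the concrete numbers are arbitrary.

```python
import random

random.seed(0)
da, db = 2.7, 4.1                              # one ISI pair (Delta_a, Delta_b)
u = [random.uniform(0.1, 1.0) for _ in range(5)]
u = [x * db / sum(u) for x in u]               # h+1 sub-ISIs summing to Delta_b

# squared warping cost before splitting
whole = (da ** 0.5 - db ** 0.5) ** 2
# after splitting: each sub-ISI u_j of Delta_b is paired with u_j * da/db in S_k,
# so every term has the common ratio da/db and the sum telescopes back to `whole`
split = sum(((uj * da / db) ** 0.5 - uj ** 0.5) ** 2 for uj in u)
```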
Wu, W., Srivastava, A. An information-geometric framework for statistical inferences in the neural spike train space. J Comput Neurosci 31, 725–748 (2011). https://doi.org/10.1007/s10827-011-0336-x