Abstract
Estimating sample averages and sample variability is important in analyzing neural spike train data in computational neuroscience. Current approaches have focused on advancing the use of parametric or semiparametric probability models of the underlying stochastic process, where the probability distribution is characterized at each time point with basic statistics such as the mean and variance. To directly capture and analyze the average and variability in the observation space of the spike trains, we focus on a data-driven approach where statistics are defined and computed in a function space in which the spike trains are viewed as individual points. Based on the definition of a “Euclidean” metric, a recent paper introduced the notion of the mean of a set of spike trains and developed an efficient algorithm to compute it under some restrictive conditions. Here we extend this study by: (1) developing a novel algorithm for mean computation that is quite general, and (2) introducing a notion of covariance of a set of spike trains. Specifically, we estimate the covariance matrix using the geometry of the warping functions that map the mean spike train to each of the spike trains in the dataset. Results from simulations as well as a neural recording in primate motor cortex indicate that the proposed mean and covariance successfully capture the observed variability in spike trains. In addition, a “Gaussian-type” probability model (defined using the estimated mean and covariance) reasonably characterizes the distribution of the spike trains and achieves desirable performance in the classification of spike trains.
References
Abbott, L.F., & Sejnowski, T.J. (1999). Neural codes and distributed representations: foundations of neural computation. The MIT Press.
Aronov, D. (2003). Fast algorithm for the metric-space analysis of simultaneous responses of multiple single neurons. Journal of Neuroscience Methods, 124, 175–179.
Aronov, D., & Victor, J. (2004). Non-Euclidean properties of spike train metric spaces. Physical Review E, 69, 061905.
Aronov, D., Reich, D.S., Mechler, F., Victor, J. (2003). Neural coding of spatial phase in V1 of the macaque monkey. Journal of Neurophysiology, 89, 3304–3327.
Bickel, P.J., & Levina, E. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics, 36, 199–227.
Bilodeau, M., & Brenner, D. (1999). Theory of multivariate statistics. Springer.
Box, G.E.P., Hunter, W.G., Hunter, J.S. (1978). Statistics for experimenters: An introduction to design, data analysis, and model building. New York: Wiley.
Brown, E.N., Barbieri, R., Ventura, V., Kass, R.E., Frank, L.M. (2002). The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14, 325–346.
Dayan, P., & Abbott, L.F. (2001). Theoretical neuroscience: Computational and mathematical modeling of neural systems. The MIT Press.
Dryden, I.L., Koloydenko, A., Zhou, D. (2009). Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging. Annals of Applied Statistics, 3, 1102–1123.
Dubbs, A.J., Seiler, B.A., Magnasco, M.O. (2010). A fast L p spike alignment metric. Neural Computation, 22, 2785–2808.
Houghton, C. (2009). Studying spike trains using a van Rossum metric with a synapse-like filter. Journal of Computational Neuroscience, 26, 149–155.
Houghton, C., & Sen, K. (2008). A new multineuron spike train metric. Neural Computation, 20, 1495–1511.
Hunter, J.D., & Milton, J.G. (2003). Amplitude and frequency dependence of spike timing: implications for dynamic regulation. Journal of Neurophysiology, 90, 387–394.
Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics, 30, 509–541.
Kass, R.E., & Ventura, V. (2001). A spike-train probability model. Neural Computation, 13, 1713–1720.
Kass, R.E., & Vos, P.W. (1997). Geometric foundations of asymptotic inference. Wiley.
Kass, R.E., Ventura, V., Brown, E.N. (2005). Statistical issues in the analysis of neuronal data. Journal of Neurophysiology, 94, 8–25.
Klassen, E., Srivastava, A., Mio, W., Joshi, S.H. (2004). Analysis of planar shapes using geodesic paths on shape spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 372–383.
Kreuz, T., Haas, J.S., Morelli, A., Abarbanel, H., Politi, A. (2007). Measuring spike train synchrony. Journal of Neuroscience Methods, 165, 151–161.
Kurtek, S., Srivastava, A., Wu, W. (2011). Signal estimation under random time-warpings and nonlinear signal alignment. In Neural Information Processing Systems (NIPS).
Levina, E., Rothman, A., Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested lasso penalty. The Annals of Applied Statistics, 2, 245–263.
Lim, D., & Capranica, R.R. (1994). Measurement of temporal regularity of spike train responses in auditory nerve fibers of the green treefrog. Journal of Neuroscience Methods, 52, 203–213.
MacLeod, K., Backer, A., Laurent, G. (1998). Who reads temporal information contained across synchronized and oscillatory spike trains? Nature, 395, 693–698.
Michor, P.W., & Mumford, D. (2007). An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Applied and Computational Harmonic Analysis, 23, 74–113.
Paiva, A.R.C., Park, I., Principe, J.C. (2009a). A comparison of binless spike train measures. Neural Computing and Applications. doi:10.1007/s00521-009-0307-6.
Paiva, A.R.C., Park, I., Principe, J.C. (2009b). A reproducing kernel Hilbert space framework for spike train signal processing. Neural Computation, 21, 424–449.
Pennec, X. (2006). Intrinsic statistics on Riemannian manifolds: basic tools for geometric measurements. Journal of Mathematical Imaging and Vision, 25, 127–154.
Quiroga, R.Q., Kreuz, T., Grassberger, P. (2002). Event synchronization: a simple and fast method to measure synchronicity and time delay patterns. Physical Review E, 66, 041904.
Ramsay, J.O., & Silverman, B.W. (2005). Functional data analysis (2nd ed.). Springer Series in Statistics.
Rencher, A.C. (2002). Methods of multivariate analysis. Wiley.
Rieke, F., Warland, D., Ruyter van Steveninck, R.R., Bialek, W. (1997). Spikes: Exploring the neural code. MIT Press.
Schreiber, S., Fellous, J., Whitmer, D., Tiesinga, P., Sejnowski, T. (2003). A new correlation-based measure of spike timing reliability. Neurocomputing, 52–54, 925–931.
Srivastava, A., & Jermyn, I.H. (2009). Looking for shapes in two-dimensional, cluttered point clouds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(9), 1616–1629.
Tukey, J.W. (1977). Exploratory data analysis. Reading: Addison-Wesley.
Valderrama, M.J. (2007). An overview to modelling functional data. Computational Statistics, 22, 331–334.
van Rossum, M.C.W. (2001). A novel spike distance. Neural Computation, 13, 751–763.
Victor, J.D., & Purpura, K.P. (1996). Nature and precision of temporal coding in visual cortex: a metric-space analysis. Journal of Neurophysiology, 76, 1310–1326.
Victor, J.D., & Purpura, K.P. (1997). Metric-space analysis of spike trains: theory, algorithms and application. Network, 8, 127–164.
Victor, J.D., Goldberg, D.H., Gardner, D. (2007). Dynamic programming algorithms for comparing multineuronal spike trains via cost-based metrics and alignments. Journal of Neuroscience Methods, 161, 351–360.
Wu, W., & Srivastava, A. (2011). An information-geometric framework for statistical inferences in the neural spike train space. Journal of Computational Neuroscience, 31, 725–748.
Younes, L., Michor, P.W., Shah, J., Mumford, D. (2008). A metric on shape space with explicit geodesics. Rendiconti Lincei Matematica e Applicazioni, 9, 25–57.
Acknowledgements
This research is supported in part by the grants NSF IIS-0916154 to WW and NSF DMS-0915003, ONR N00014-09-1-0664, and AFOSR FA9550-06-1-0324 to AS. We thank Prof. Nicholas Hatsopoulos at the University of Chicago for providing experimental data.
Additional information
Action Editor: Mark van Rossum
Appendices
Appendix A: Existence of the optimal warping function
We first state a theoretical fact on the optimal time warping under the metric d p ; it is the same as Lemma 1 in Wu and Srivastava (2011), restated here to emphasize its importance.
Lemma 1
For p ≥ 1 and any positive diffeomorphism γ from [a, c] to [b, d] (i.e. \(\gamma(a) = b, \gamma(c) = d, 0 < \dot \gamma(t) < \infty\) ), the time warping has the following optimal cost:
The infimum is attained when the warping is linear from [a, c] to [b, d] (i.e., \(\dot \gamma(t) = \frac{d-b}{c-a}\) ); for any p > 1 this condition is also necessary.
Based on this result, we have the following theorem on the existence of the optimal warping function between any two spike trains.
Theorem 1
Assume \(f(t) = \sum_{i=1}^M \delta(t-t_i) \in {\cal S}_M\) and \(g(t) = \sum_{j=1}^N \delta(t-s_j) \in {\cal S}_N\) are two spike trains in [0, T]. Then there exists a warping function \(\gamma^\ast \in \Gamma_{f,g}\) , such that
where 1 ≤ p < ∞ and 0 < λ < ∞.
Proof
For any warping function γ ∈ Γ, we have two sets of spike times, [f] and [g ∘ γ]. Let \(\{a_l\}_{l=1}^L\) be the intersection of [f] and [g ∘ γ], where L is the cardinality of {a l }. This implies X([f],[g ∘ γ]) = M + N − 2L. Denoting a 0 = 0 and a L + 1 = T, we define a warping function \(\tilde \gamma \in \Gamma_{f,g}\) as follows:

$$ \tilde \gamma(t) = \gamma(a_l) + \frac{\gamma(a_{l+1}) - \gamma(a_l)}{a_{l+1} - a_l}\,(t - a_l), \qquad t \in [a_l, a_{l+1}],\ l = 0, \cdots, L. $$
For illustration, we show the warping functions γ and \(\tilde \gamma\) in Fig. 12, where \(\tilde \gamma\) is a piecewise-linear version of γ obtained by linearly connecting the grid points (a l , γ(a l )), l = 0, ⋯ , L + 1. We note that \(\tilde \gamma\) preserves the time matchings between {a l } and {γ(a l )}. This implies that
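As an aside, the piecewise-linear construction of \(\tilde \gamma\) is easy to realize numerically. The sketch below (with the hypothetical names matched_times and warped_times standing for {a l } and {γ(a l )}) linearly interpolates between the grid points; it is only an illustration of the construction, not code from the paper.

```python
def piecewise_linear_warp(matched_times, warped_times, T):
    """Piecewise-linear warping connecting the grid points (a_l, gamma(a_l)),
    with the boundary points a_0 = 0 and a_{L+1} = T appended."""
    knots_x = [0.0] + list(matched_times) + [T]
    knots_y = [0.0] + list(warped_times) + [T]

    def gamma_tilde(t):
        # Locate the segment [a_l, a_{l+1}] containing t and interpolate linearly.
        for l in range(len(knots_x) - 1):
            if knots_x[l] <= t <= knots_x[l + 1]:
                slope = (knots_y[l + 1] - knots_y[l]) / (knots_x[l + 1] - knots_x[l])
                return knots_y[l] + slope * (t - knots_x[l])
        raise ValueError("t outside [0, T]")

    return gamma_tilde
```

For example, with a single matched pair (0.3, 0.5) on [0, 1], the resulting warping fixes the endpoints and maps 0.3 to 0.5.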
As \(\tilde \gamma\) is linear from [a l − 1, a l ] to [γ(a l − 1), γ(a l )], using Lemma 1, we have
Therefore,
This indicates that \(\tilde \gamma \in \Gamma_{f,g}\) provides a time warping for measuring the distance between f and g that is at least as good as γ. Note that the set Γ f,g is finite. This implies that,
Therefore, there exists \(\gamma^\ast \in \Gamma_{f, g} \subset \Gamma\), such that
□
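Because Γ f,g is finite, computing the optimal warping amounts to deciding which spikes of f and g to match, a discrete problem solvable by dynamic programming in the spirit of Victor and Purpura (1996). The sketch below is not the metric d p of this paper: it is a simplified illustration that charges a penalty λ for each unmatched spike and |t − s| p for each matched pair.

```python
def spike_match_cost(f_times, g_times, lam=1.0, p=2):
    """Edit-distance-style DP over spike times: each spike is either matched
    (moving cost |t - s|**p) or left unmatched (cost lam).  Illustrates that
    the search over warpings reduces to a finite spike-to-spike matching."""
    M, N = len(f_times), len(g_times)
    # D[i][j] = minimal cost of aligning the first i spikes of f with the first j of g.
    D = [[0.0] * (N + 1) for _ in range(M + 1)]
    for i in range(1, M + 1):
        D[i][0] = i * lam
    for j in range(1, N + 1):
        D[0][j] = j * lam
    for i in range(1, M + 1):
        for j in range(1, N + 1):
            D[i][j] = min(
                D[i - 1][j] + lam,  # leave t_i unmatched
                D[i][j - 1] + lam,  # leave s_j unmatched
                D[i - 1][j - 1] + abs(f_times[i - 1] - g_times[j - 1]) ** p,  # match them
            )
    return D[M][N]
```

With a small λ, distant spikes are left unmatched rather than moved, mirroring the role of the penalty parameter in spike-train metrics of this family.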
Appendix B: Convergence of the MCP-algorithm
Theorem 2
Denote the estimated mean in the j th iteration of the MCP-algorithm as S (j) . Then the sum of squared distances \(\sum_{i=1}^N d_2(S_i, S^{(j)})^2\) decreases iteratively. That is,

$$ \sum\limits_{i=1}^N d_2\left(S_i, S^{(j+1)}\right)^2 \le \sum\limits_{i=1}^N d_2\left(S_i, S^{(j)}\right)^2. $$
Proof
In the j th iteration, we first perform the Matching Step and find the optimal warping γ i from S i to S (j), i = 1, ⋯ , N. That is,
Then, we perform the Centering Step to compute the extrinsic mean of warping functions \(\{\gamma_i\}_{i=1}^N\). As shown in Section 2.2, the mean warping function, \(\bar \gamma\), of \(\{\gamma_i\}_{i=1}^N\) is also piecewise linear and its SRVF has the closed form given by Eq. (6). By the definition of the extrinsic mean, we have
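As a numerical illustration of the Centering Step, the following sketch computes a discretized extrinsic mean of warping functions, under the assumption that the SRVF of a warping γ is \(\sqrt{\dot \gamma}\); it averages the SRVFs, squares the average back to a derivative, and renormalizes so that the mean maps [0, T] onto [0, T]. It mimics, but does not reproduce, the closed form of Eq. (6).

```python
import numpy as np

def extrinsic_mean_warp(warps, t):
    """Extrinsic mean of warping functions sampled on the common grid t:
    average the SRVFs sqrt(d gamma/dt), square the average back to a
    derivative, integrate, and rescale so the mean ends at T = t[-1]."""
    qs = [np.sqrt(np.maximum(np.gradient(g, t), 0.0)) for g in warps]
    q_mean = np.mean(qs, axis=0)
    deriv = q_mean ** 2  # back from SRVF to a (nonnegative) derivative
    # Trapezoidal integration of the derivative, starting from gamma(0) = 0.
    gamma = np.concatenate(([0.0], np.cumsum(0.5 * (deriv[1:] + deriv[:-1]) * np.diff(t))))
    gamma *= t[-1] / gamma[-1]  # renormalize so that gamma(T) = T
    return gamma
```

By construction the output is a nondecreasing function fixing the endpoints, as a warping function must be.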
We then apply \({\bar \gamma}^{-1}\) to both S i ∘ γ i and S (j). In this operation, the spikes in both trains are moved with the same warping function. Using the definition of the d 2 metric, we have
One can easily verify that
Therefore,
Next, we perform the Pruning Step and update the number of spikes in the mean. \([\widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}]\) is the set of spikes in \([S^{(j)} \circ {\bar \gamma^{-1}}]\) that appear more than N/2 times in {[\(S_i\circ\gamma_i \circ \bar\gamma^{-1}\)]}. Based on the definition of the d 2 metric, we have
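The majority-vote rule of the Pruning Step can be sketched as follows; here spike times are assumed to lie on a common discrete grid so that equality testing is exact (an assumption of this sketch, not of the paper).

```python
from collections import Counter

def prune_mean_spikes(mean_spikes, warped_trains, N):
    """Keep only those spikes of the current mean that appear in more than
    N/2 of the N aligned trains {S_i o gamma_i o gamma_bar^{-1}}."""
    counts = Counter(s for train in warped_trains for s in train)
    return [s for s in mean_spikes if counts[s] > N / 2]
```

For instance, a spike occurring in all three of three aligned trains survives, while one occurring in a single train is pruned.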
Using the basic rule of the Exclusive OR, it is easy to find that
Therefore,
Finally, we test whether one spike in \([\widetilde {S^{(j)}\circ {\bar \gamma}^{-1}}]\) that appears the minimal number of times in {[\(S_i\circ\gamma_i \circ \bar\gamma^{-1}\)]} can be removed.
(i)
If the spike can be removed, the remaining set of spikes is denoted as \([\widetilde {S^{(j)}\circ {\bar \gamma}^{-1}}^{-}]\). The removal is performed only under the condition that \(\sum_{i=1}^N d_2(S_i, \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}^{-})^2 \le \sum_{i=1}^N d_2(S_i, \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}})^2\). We then update the mean spike train \(S^{(j+1)} = \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}^{-}\) and the number of spikes in the mean n = |S (j + 1)|. In this case, we have
$$ \begin{array}{rll} \sum\limits_{i=1}^N d_2\left(S_i, S^{(j+1)}\right)^2 &=& \sum\limits_{i=1}^N d_2\left(S_i, \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}^{-}\right)^2\\ &\le& \sum\limits_{i=1}^N d_2\left(S_i, \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}\right)^2 \\ &\le& \sum\limits_{i=1}^N d_2\left(S_i, S^{(j)}\right)^2 \end{array} $$
(ii)
If the spike cannot be removed, we update the mean spike train \(S^{(j+1)} = \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}\) and the number of spikes in the mean n = |S (j + 1)|. In this case, we also have
$$ \begin{array}{rll} \displaystyle\sum\limits_{i=1}^N d_2\left(S_i, S^{(j+1)}\right)^2 &=& \sum\limits_{i=1}^N d_2\left(S_i, \widetilde {S^{(j)} \circ {\bar \gamma^{-1}}}\right)^2\\ &\le& \sum\limits_{i=1}^N d_2\left(S_i, S^{(j)}\right)^2. \end{array} $$
In summary, we have shown that the sum of squared distances \(\sum_{i=1}^N d_2(S_i, S^{(j)})^2\) decreases iteratively.□
Wu, W., Srivastava, A. Estimating summary statistics in the spike-train space. J Comput Neurosci 34, 391–410 (2013). https://doi.org/10.1007/s10827-012-0427-3