
Capacity analysis in multi-state synaptic models: a retrieval probability perspective


Abstract

We define the memory capacity of networks of binary neurons with finite-state synapses in terms of retrieval probabilities of learned patterns under standard asynchronous dynamics with a predetermined threshold. The threshold is set to control the proportion of non-selective neurons that fire. An optimal inhibition level is chosen to stabilize network behavior. For any local learning rule we provide a computationally efficient and highly accurate approximation to the retrieval probability of a pattern as a function of its age. The method is applied to the sequential models (Fusi and Abbott, Nat Neurosci 10:485–493, 2007) and meta-plasticity models (Fusi et al., Neuron 45(4):599–611, 2005; Leibold and Kempter, Cereb Cortex 18:67–77, 2008). We show that as the number of synaptic states increases, the capacity, as defined here, either plateaus or decreases. In the few cases where multi-state models exceed the capacity of binary synapse models the improvement is small.



Notes

  1. The fraction of non-selective neurons above the threshold is not specified in the criterion, since it is already controlled by the threshold selection. Moreover, because of the strong inhibition, there cannot be many non-selective neurons above threshold at any point of the dynamics.

  2. The number of non-selective neurons allowed above the threshold is not specified in the criterion, since it is already controlled by the threshold selection.

References

  • Abraham, W. C., & Bear, M. F. (1996). Metaplasticity: The plasticity of synaptic plasticity. Trends in Neurosciences, 19(4), 126–130. doi:10.1016/S0166-2236(96)80018-X.

  • Amit, D. J., & Brunel, N. (1997a). Dynamics of recurrent network of spiking neurons before and following learning. Network, 8, 373–404.

  • Amit, D. J., & Brunel, N. (1997b). Model of global spontaneous activity and local structured activity during delay periods in the cerebral cortex. Cerebral Cortex, 7, 237–252.

  • Amit, D. J., & Fusi, S. (1994). Learning in neural networks with material synapses. Neural Computation, 6, 957–982.

  • Amit, D. J., & Mongillo, G. (2003). Selective delay activity in the cortex: Phenomena and interpretation. Cerebral Cortex, 13, 1139–1150.

  • Amit, Y., & Huang, Y. (2010). Precise capacity analysis in binary networks with multiple coding level inputs. Neural Computation, 22(3), 660–688. doi:10.1162/neco.2009.02-09-967.

  • Attwell, D., & Laughlin, S. B. (2001). An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow and Metabolism, 21(10), 1133–1145.

  • Barnes, C. A., McNaughton, B. L., Mizumori, S. J., Leonard, B. W., & Lin, L. H. (1990). Comparison of spatial and temporal characteristics of neuronal activity in sequential stages of hippocampal processing. Progress in Brain Research, 83, 287–300.

  • Barret, A. B., & van Rossum, M. C. (2008). Optimal learning rules for discrete synapses. PLoS Computational Biology, 4, 1–7.

  • Ben Dayan Rubin, D. D., & Fusi, S. (2007). Long memory lifetimes require complex synapses and limited sparseness. Frontiers in Computational Neuroscience, 1, 1–14.

  • Brunel, N. (2003). Dynamics and plasticity of stimulus-selective persistent activity in cortical network models. Cerebral Cortex, 13, 1151–1161.

  • Curti, E., Mongillo, G., La Camera, G., & Amit, D. J. (2004). Mean-field and capacity in realistic networks of spiking neurons storing sparsely coded random memories. Neural Computation, 16, 2597–2637.

  • Del Giudice, P., Fusi, S., & Mattia, M. (2003). Modelling the formation of working memory with networks of integrate-and-fire neurons connected by plastic synapses. Journal of Physiology Paris, 97(4–6), 659–681. doi:10.1016/j.jphysparis.2004.01.021.

  • Fusi, S., & Abbott, L. F. (2007). Limits on the memory storage capacity of bounded synapses. Nature Neuroscience, 10, 485–493.

  • Fusi, S., Drew, P. J., & Abbott, L. (2005). Cascade models of synaptically stored memories. Neuron, 45(4), 599–611. doi:10.1016/j.neuron.2005.02.001.

  • Fuster, J. (1995). Memory in the cerebral cortex: An empirical approach to neural networks in the human and nonhuman primate. Cambridge, MA: MIT Press.

  • Jung, M. W., & McNaughton, B. L. (1993). Spatial selectivity of unit activity in the hippocampal granular layer. Hippocampus, 3, 165–182.

  • Leibold, C., & Kempter, R. (2008). Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cerebral Cortex, 18, 67–77.

  • Lennie, P. (2003). The cost of cortical computation. Current Biology, 13(6), 493–497. doi:10.1016/S0960-9822(03)00135-0.

  • Miyashita, Y., & Hayashi, T. (2000). Neural representation of visual objects: Encoding and top-down activation. Current Opinion in Neurobiology, 10(2), 187–194.

  • Olshausen, B. A., & Field, D. J. (2004). Sparse coding of sensory inputs. Current Opinion in Neurobiology, 14(4), 481–487. doi:10.1016/j.conb.2004.07.007.

  • Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature, 435, 1102–1107.

  • Robertson, T., Wright, F. T., & Dykstra, R. L. (1988). Order restricted statistical inference. Wiley.

  • Rolls, E. T., & Tovee, M. J. (1995). Sparseness of the neuronal representation of stimuli in the primate temporal visual cortex. Journal of Neurophysiology, 73(2), 713–726.

  • Romani, S., Amit, D., & Amit, Y. (2008). Optimizing one-shot learning with binary synapses. Neural Computation, 20, 1928–1950.

  • Sato, T., Uchida, G., & Tanifuji, M. (2007). The nature of neuronal clustering in inferotemporal cortex of macaque monkey revealed by optical imaging and extracellular recording. In 34th Annual Meeting of the Society for Neuroscience, San Diego, USA.

  • Wang, X. J. (2001). Synaptic reverberation underlying mnemonic persistent activity. Trends in Neurosciences, 24(8), 455–463. doi:10.1016/S0166-2236(00)01868-3.

  • Willshaw, D., Buneman, O. P., & Longuet-Higgins, H. (1969). Non-holographic associative memory. Nature (London), 222, 960–962.

Author information

Corresponding author

Correspondence to Yibi Huang.

Additional information

Action Editor: Mark van Rossum

Supported in part by NSF ITR DMS-0706816.

Electronic Supplementary Material

Supplementary material is available online (PDF, 131 KB).

Appendix

A.1 Kronecker product

If A is an m×n matrix and B is a p×q matrix, then the Kronecker product A ⊗ B is the mp×nq block matrix

$$ A\otimes B = \begin{bmatrix} a_{11} B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{bmatrix}. $$

The Kronecker product is bilinear and associative:

  • A ⊗ (B + C) = A ⊗ B + A ⊗ C,

  • (A + B) ⊗ C = A ⊗ C + B ⊗ C,

  • (kA) ⊗ B = A ⊗ (kB) = k(A ⊗ B),

  • (A ⊗ B) ⊗ C = A ⊗ (B ⊗ C),

where A, B and C are matrices and k is a scalar. If A, B, C and D are matrices of such size that one can form the matrix products AC and BD, then

$$ (A \otimes B)(C \otimes D) = AC \otimes BD. $$

This is called the mixed-product property because it mixes the ordinary matrix product and the Kronecker product.
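
These identities are easy to check numerically. The following minimal numpy sketch (an illustration added here, not part of the original article) verifies the mixed-product property and bilinearity on random matrices.

```python
import numpy as np

# Numerical check of the Kronecker product identities listed above.
rng = np.random.default_rng(0)
A, C = rng.random((2, 3)), rng.random((3, 4))   # AC is defined
B, D = rng.random((5, 2)), rng.random((2, 6))   # BD is defined

# Mixed-product property: (A x B)(C x D) = (AC) x (BD).
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# Bilinearity in the second argument: A x (B + B2) = A x B + A x B2.
B2 = rng.random((5, 2))
assert np.allclose(np.kron(A, B + B2), np.kron(A, B) + np.kron(A, B2))
```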

A.2 Transition matrices and covariances

As the synaptic modification rule is local, the evolution of the state of a synapse is a Markov chain. Summing over the possible firing states of the pre- and postsynaptic neurons, the probability that a synapse makes a transition from \(\alpha_m\) to \(\alpha_{m'}\) is

$$ f^2 q^{11}_{m m'} + f(1\!-\!f) q^{10}_{m m'}+ (1\!-\!f) f q^{01}_{m m'} + (1\!-\!f)^2 q^{00}_{m m'}, $$

so the transition matrix can then be written as

$$ P = f^2Q^{11}+f(1\!-\!f) Q^{10}+f(1\!-\!f) Q^{01}+(1\!-\!f)^2Q^{00}. $$

Similarly, a pair of synapses with a common postsynaptic neuron, \((J^{(p)}_{ij}, J^{(p)}_{ik})\), is also a Markov chain, whose state space is the product \(\{\alpha_1, \alpha_2,\ldots,\alpha_M\}\times\{\alpha_1, \alpha_2,\ldots,\alpha_M\}\). Again summing over the firing states of the three neurons i, j, k, the transition probability from \((\alpha_l,\alpha_m)\) to \((\alpha_{l'},\alpha_{m'})\) is

$$\begin{array}{lll} &&{\kern-6pt}f[fq^{11}_{l l'}+(1\!-\!f) q^{10}_{l l'}][fq^{11}_{m m'}+(1\!-\!f) q^{10}_{m m'}]\\ &&+\;(1\!-\!f) [fq^{01}_{l l'}+(1\!-\!f) q^{00}_{l l'}][fq^{01}_{m m'}+(1\!-\!f) q^{00}_{m m'}] \end{array}$$

so the transition matrix is

$$\begin{array}{rll} S&=&f [fQ^{11}+(1\!-\!f) Q^{10}]\otimes[fQ^{11}+(1\!-\!f) Q^{10}]\\ &&+\;(1\!-\!f) [fQ^{01}\!+\!(1\!-\!f) Q^{00}]\otimes[fQ^{01}\!+\!(1\!-\!f) Q^{00}] \end{array}$$
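
As an illustration, the two constructions above take only a few lines of numpy. The sketch below assumes the four conditional matrices Q^{11}, Q^{10}, Q^{01}, Q^{00} are given as M x M row-stochastic arrays and f is the coding level; the function name is ours, not the paper's.

```python
import numpy as np

def transition_matrices(Q11, Q10, Q01, Q00, f):
    """Assemble the single-synapse chain P and the pair chain S from the four
    conditional transition matrices Q^{xy} (each M x M, row-stochastic) and the
    coding level f, following the two displayed equations above."""
    P = f**2 * Q11 + f*(1 - f) * Q10 + f*(1 - f) * Q01 + (1 - f)**2 * Q00
    P1 = f * Q11 + (1 - f) * Q10          # postsynaptic neuron selective (on)
    P0 = f * Q01 + (1 - f) * Q00          # postsynaptic neuron non-selective (off)
    S = f * np.kron(P1, P1) + (1 - f) * np.kron(P0, P0)
    return P, S
```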

Suppose the network is initialized in its stationary state, so that \(\pi_x^{(0)}=\pi\) and \(\gamma^{(0)}_x=\gamma\). To obtain the mean and variance after the first step of learning (Eq. (6)), note that, given \(\xi^{(1)}_i=x\) and the presynaptic neurons on (\(\xi^{(1)}_j=\xi^{(1)}_k=1\)), the transition probability for \(J^{(p)}_{ij}\) from \(\alpha_m\) to \(\alpha_{m'}\) is \(q^{x1}_{m m'}\), and the transition probability for \((J^{(p)}_{ij}, J^{(p)}_{ik})\) from \((\alpha_l,\alpha_m)\) to \((\alpha_{l'},\alpha_{m'})\) is \(q^{x1}_{l l'}q^{x1}_{m m'}\). This explains Eq. (6).

Since \(J^{(p)}_{ij}\) is a vector of indicator variables, \(\mathsf{E}[J^{(p)}_{ij}|\xi^{(1)}_i=x,\xi^{(1)}_j=1]\) and \(\mathsf{E}[J^{(p)}_{ij}\otimes J^{(p)}_{ik}|\xi^{(1)}_i=x,\xi^{(1)}_j=\xi^{(1)}_k=1]\) are exactly the p-step distributions \(\pi^{(p)}_x\) and \(\gamma^{(p)}_x\) of the two Markov chains, and Eq. (7) follows from the Chapman–Kolmogorov equations.

Equation (8) is straightforward since \(W^{(p)}_{ij}=J^{(p)}_{ij}\mathbf{w}^T\). One can verify that \(\mathsf{Var}(J^{(p)}_{ij}|\xi^{(1)}_i=x,\xi^{(1)}_j=1)=\mathrm{Diag}(\pi_x^{(p)}) -\pi_x^{(p)}(\pi_x^{(p)})^T\) and deduce Eq. (9), where \(\mathrm{Diag}(\pi_x^{(p)})\) is the diagonal matrix with \(\pi_x^{(p)}\) on the diagonal. To obtain Eq. (10) we use:

$$\begin{array}{rll} \mathsf{Cov}(J_{ij}\mathbf{w}^T,J_{ik}\mathbf{w}^T)&=&\mathsf{E}[(J_{ij}\mathbf{w}^T)^T J_{ik}\mathbf{w}^T] - \mu^2\\ &=& \mathbf{w}\mathsf{E}[J_{ij}^T J_{ik}]\mathbf{w}^T - \mu^2\\&=&\gamma(\mathbf{w}\otimes\mathbf{w})^{T} -\mu^2. \end{array}$$

where the last equality follows from the mixed-product property of the Kronecker product (see Appendix A.1).
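
The bookkeeping described in the last few paragraphs can be summarized in a short sketch. This is a reconstruction under stated assumptions (one-hot indicator vectors J, weight vector w, first-step matrix Q^{x1}, then p-1 steps of P and S); the names are illustrative and the equation references are only indicative, since Eqs. (6)-(10) appear earlier in the paper.

```python
import numpy as np

def weight_moments(pi, Q_x1, P, S, w, p):
    """Rough sketch of the computation behind Eqs. (6)-(10): propagate the state
    distribution of one synapse and of a synapse pair for p-1 further patterns,
    then read off the mean, variance and covariance of the weights W = J w^T.
    pi   : stationary state distribution (length M)
    Q_x1 : transition matrix applied when the tracked pattern is stored
    P, S : single- and pair-synapse transition matrices for later patterns
    w    : weight value attached to each of the M synaptic states"""
    pi_p = pi @ Q_x1 @ np.linalg.matrix_power(P, p - 1)           # cf. Eqs. (6)-(7)
    gamma_p = (np.kron(pi, pi) @ np.kron(Q_x1, Q_x1)
               @ np.linalg.matrix_power(S, p - 1))                # cf. Eqs. (6)-(7)
    mu = pi_p @ w                                                  # mean, cf. Eq. (8)
    var = pi_p @ (w * w) - mu**2                                   # variance, cf. Eq. (9)
    cov = gamma_p @ np.kron(w, w) - mu**2                          # covariance, cf. Eq. (10)
    return mu, var, cov
```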

A.3 Decorrelating the synapses

If the four Q matrices are of the form \(Q^{11}=I_{2m}+q_+D_+\), \(Q^{01}=Q^{10}=I_{2m}+q_-D_-\), \(Q^{00}=I_{2m}+q_0D_+\), where \(q_0=\frac{f^2q_+}{(1-f)^2}\), \(q_-=\frac{\tau f q_+}{1-f}\), then

$$\begin{array}{lll} P_1=I_{2m}+fq_+(D_++\tau D_-),\\ P_0=I_{2m}+\dfrac{f^2q_+}{1-f}(D_++\tau D_-),\\ P=I_{2m}+2f^2q_+(D_++\tau D_-). \end{array}$$

If \(\pi\) is the stationary distribution of P, i.e. \(\pi P = \pi\), then \(\pi(D_+ + \tau D_-) = 0\), which implies \(\pi P_1 = \pi P_0 = \pi\). Hence \(\gamma = \pi \otimes \pi\) is the stationary distribution of S, since

$$\begin{array}{rll} (\pi\otimes\pi) S &=& (\pi\otimes\pi) [f P_1\otimes P_1 + (1-f) P_0\otimes P_0]\\ &=&f (\pi P_1\otimes \pi P_1) + (1-f) (\pi P_0)\otimes (\pi P_0)\\ &=&f (\pi\otimes \pi) + (1-f) (\pi\otimes \pi)=(\pi\otimes \pi). \end{array}$$

Thus the stationary synaptic covariance ρ defined in Eq. (14) must be 0.
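
This argument can be checked numerically once concrete D matrices are chosen. In the sketch below, arbitrary zero-row-sum matrices stand in for the D_+ and D_- of the main text (an assumption made purely for illustration); the check confirms that pi x pi is stationary for S.

```python
import numpy as np

rng = np.random.default_rng(1)
M, f, tau, q_plus = 3, 0.2, 0.5, 0.05

def zero_row_sum(rng, M):
    # Non-negative off-diagonal entries, diagonal chosen so every row sums to 0,
    # so that I + q*D is a stochastic matrix for small q > 0.
    D = rng.random((M, M))
    np.fill_diagonal(D, 0.0)
    np.fill_diagonal(D, -D.sum(axis=1))
    return D

D_plus, D_minus = zero_row_sum(rng, M), zero_row_sum(rng, M)
q0 = f**2 * q_plus / (1 - f)**2
q_minus = tau * f * q_plus / (1 - f)

I = np.eye(M)
Q11, Q00 = I + q_plus * D_plus, I + q0 * D_plus
Q10 = Q01 = I + q_minus * D_minus

P1 = f * Q11 + (1 - f) * Q10
P0 = f * Q01 + (1 - f) * Q00
P = f * P1 + (1 - f) * P0
S = f * np.kron(P1, P1) + (1 - f) * np.kron(P0, P0)

# Stationary distribution of P: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi /= pi.sum()

# pi (x) pi is then stationary for the pair chain S, so the stationary
# covariance between two synapses sharing a postsynaptic neuron vanishes.
assert np.allclose(np.kron(pi, pi) @ S, np.kron(pi, pi))
```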

This learning rule works only for single-level coded stimuli, not for multi-level coded stimuli.

Table 2 Summary of properties of different multi-state models

A.4 Bias in sparseness measurement

In Rolls and Tovee (1995) and Sato et al. (2007) coding level is defined as

$$ a = \frac{\left(\sum_{i=1}^n r_i/n\right)^2}{\sum_{i=1}^n r_i^2/n} \qquad (36) $$

where \(r_i\) is the firing rate of the neuron to the ith stimulus in a set of n stimuli. Suppose the baseline firing rate of the neuron is r, that the firing rate is Lr (L > 1) when the neuron is selective to the stimulus, and that the true sparseness is f. Then on average \(\sum_i r_i/n \approx fLr + (1-f)r\) and \(\sum_i r_i^2/n \approx fL^2r^2 + (1-f)r^2\), so

$$ a \approx \frac{(fL+1-f)^2}{fL^2+1-f} > f. $$

Though af as L→ ∞, when f→0, a ≈ 1 − f. The relationship between a and f is not monotone and when L = 30, a is always above 0.1 whatever f is (Fig. 11). If the baseline firing rate is subtracted from each r i , then L will be larger and make a closer to f.

Fig. 11: The sparseness measure a (Eq. (36)) versus the actual coding level f, for L = 10 and 30


Cite this article

Huang, Y., Amit, Y. Capacity analysis in multi-state synaptic models: a retrieval probability perspective. J Comput Neurosci 30, 699–720 (2011). https://doi.org/10.1007/s10827-010-0287-7
