
Non-parametric Algorithmic Generation of Neuronal Morphologies


Abstract

Generation algorithms construct Virtual Neurons (VNs) from a small set of morphological properties. The set describes the morphological properties of real neurons in terms of statistical descriptors such as the number of branches and segment lengths (among others). The majority of reconstruction algorithms use the observed properties to estimate the parameters of a priori fixed probability distributions, and so construct statistical descriptors that fit the observed data well. In this article, we present a non-parametric generation algorithm based on kernel density estimators (KDEs). The new algorithm, called KDE-Neuron, has three advantages over parametric reconstruction algorithms: (1) it requires no a priori specification of the distributions underlying the real data, (2) peculiarities in the biological data are reflected in the VNs, and (3) it can reconstruct different cell types. We experimentally generated motor neurons and granule cells, and statistically validated the results. Moreover, we assessed the quality of the prototype data set and observed that our generated neurons are as good as the prototype data in terms of the statistical descriptors used. The opportunities and limitations of data-driven algorithmic reconstruction of neurons are discussed.


Notes

  1. For the sake of readability, ‘generation’ and ‘reconstruction’ are used as synonyms in the remainder of the text.

  2. Mathematical details related to this section can be found in the Appendix.

  3. More advanced methods (in efficiency and effectiveness) are possible, but they are not the focus of the current work. We refer the interested reader to Ch. 11 of Bishop (2006).

  4. To avoid a pre-processing step, we work directly with compartments as specified in the SWC format, rather than with segments (a minimal SWC-reading sketch follows these notes).

  5. Outnumbering occurs because every bifurcation compartment is surrounded by prolongating compartments, and terminating compartments are always preceded by prolongating compartments.

  6. We selected the options in a simple manner by trying out a few different settings in preliminary experiments. So, even though we show good results with these options, there is still room for further fine-tuning. A grid search over the options could, for example, be used for a more extensive parameter search.

  7. Electrophysiological simulations are more precise when an odd number of compartments is used (Carnevale and Hines 2006).

  8. The low number of filtered neurons can be attributed to the fixed distribution of branching angles because, in order to pass the filter, a particular sequence of branch angles and segment lengths is required. For instance, the filter tests the Euclidean distance between soma and terminal tips, which is entirely defined by the angles and lengths of a branch.
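
As an aside to notes 4 and 5, the sketch below shows how compartments can be read from a plain seven-column SWC file (index, type, x, y, z, radius, parent) and labeled by their number of children. It is a minimal illustration under these assumptions; the helper names are ours, not part of KDE-Neuron.

```python
# Minimal sketch of reading SWC compartments (cf. notes 4-5). Assumes plain
# seven-column SWC lines: index, type, x, y, z, radius, parent.

def read_swc(path):
    """Return a dict mapping compartment index -> (type, x, y, z, radius, parent)."""
    compartments = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith('#'):   # skip blanks and comments
                continue
            idx, ctype, x, y, z, radius, parent = line.split()[:7]
            compartments[int(idx)] = (int(ctype), float(x), float(y),
                                      float(z), float(radius), int(parent))
    return compartments

def classify(compartments):
    """Label each compartment as terminating, prolongating, or bifurcating
    according to its number of child compartments (cf. note 5)."""
    children = {i: 0 for i in compartments}
    for (_, _, _, _, _, parent) in compartments.values():
        if parent in children:                     # the root has parent -1
            children[parent] += 1
    return {i: {0: 'terminating', 1: 'prolongating'}.get(n, 'bifurcating')
            for i, n in children.items()}
```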

References

  • Alpaydin, E. (2004). Introduction to machine learning. Cambridge, MA: MIT Press.

  • Ambros-Ingerson, J., & Holmes, W. R. (2005). Analysis and comparison of morphological reconstructions of hippocampal field CA1 pyramidal cells. Hippocampus, 15, 302–315.

  • Ascoli, G. A. (1999). Progress and perspectives in computational neuroanatomy. Anatomical Record, 257, 195–207.

  • Ascoli, G. A. (2006). Mobilizing the base of neuroscience data: The case of neuronal morphologies. Nature Reviews Neuroscience, 7, 318–324.

  • Ascoli, G. A. (2007). Success and rewards in sharing digital reconstructions of neuronal morphology. Neuroinformatics, 5, 154–160.

  • Ascoli, G. A., & Krichmar, J. L. (2000). L-Neuron: A modeling tool for the efficient generation and parsimonious description of dendritic morphology. Neurocomputing, 32–33, 1003–1011.

  • Ascoli, G. A., Krichmar, J. L., Scorcioni, R., Nasuto, S. J., & Senft, S. L. (2001). Computer generation and quantitative morphometric analysis of virtual neurons. Anatomy and Embryology, 204, 283–301.

  • Bishop, C. (2006). Pattern recognition and machine learning. New York, NY: Springer-Verlag.

  • Burke, R., Marks, W., & Ulfhake, B. (1992). A parsimonious description of motoneuron dendritic morphology using computer simulation. Journal of Neuroscience, 12(6), 2403–2416.

  • Burns, G. (2001). Knowledge management of the neuroscientific literature: The data model of the NeuroScholar system. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 356, 1187–1208.

  • Cameron, W., He, F., Kalipatnapu, P., Jodkowski, J., & Guthrie, R. (1991). Morphometric analysis of phrenic motoneurons in the cat during postnatal development. Journal of Comparative Neurology, 314(4), 763–776.

  • Cannon, R., Turner, D., Pyapali, G., & Wheal, H. (1998). An on-line archive of reconstructed hippocampal neurons. Journal of Neuroscience Methods, 84(1–2), 49–54.

  • Carnevale, N., & Hines, M. (2006). The NEURON book. Cambridge, UK: Cambridge University Press.

  • da F. Costa, L., Barbosa, M., & Coupez, V. (2005). On the potential of the excluded volume and autocorrelation as neuromorphometric descriptors. Physica A: Statistical Mechanics and its Applications, 348, 317–326.

  • De Schutter, E., & Bower, J. M. (1994). An active membrane model of the cerebellar Purkinje cell. I. Simulation of current clamps in slice. Journal of Neurophysiology, 71(1), 375–400.

  • Devroye, L. (1986). Non-uniform random variate generation. New York, NY: Springer-Verlag.

  • Donohue, D. E., & Ascoli, G. A. (2005). Local diameter fully constrains dendritic size in basal but not apical trees of CA1 pyramidal neurons. Journal of Computational Neuroscience, 19, 223–238.

  • Eberhard, J., Wanner, A., & Wittum, G. (2007). NeuGen: A tool for the generation of realistic morphology of cortical neurons and neural networks in 3D. Neurocomputing, 70(1–3), 327–342.

  • Feng, N., Ning, G., & Zheng, X. (2005). A framework for simulating axon guidance. Neurocomputing, 68, 70–84.

  • Fernández, E., & Jelinek, H. F. (2001). Use of fractal theory in neuroscience: Methods, advantages and potential problems. Methods, 24, 309–321.

  • Glaser, J., & Glaser, E. (1990). Neuron imaging with Neurolucida - a PC-based system for image combining microscopy. Computerized Medical Imaging and Graphics, 14, 307–317.

  • Hillman, D. (1979). Neuronal shape parameters and substructures as a basis of neuronal form. In The neurosciences, fourth study program (Ch. 27). Cambridge, MA: MIT Press.

  • Horch, H. W., & Katz, L. C. (2002). BDNF release from single cells elicits local dendritic growth in nearby neurons. Nature Neuroscience, 5(11), 1177–1184.

  • Jones, C., Marron, J., & Sheather, S. (1996). A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association, 91(433), 401–407.

  • Kaspirzhny, A. V., Gogan, P., Horcholle-Bossavit, G., & Tyc-Dumont, S. (2002). Neuronal morphology data bases: Morphological noise and assessment of data quality. Network: Computation in Neural Systems, 13, 357–380.

  • Lehmann, E., & Romano, J. (2006). Testing statistical hypotheses (3rd ed.). Berlin: Springer-Verlag.

  • Lien, J.-M., Morales, M., & Amato, N. M. (2003). Neuron PRM: A framework for constructing cortical networks. Neurocomputing, 52–54, 191–197.

  • Lindsay, K. A., Maxwell, D. J., Rosenberg, J. R., & Tucker, G. (2007). A new approach to reconstruction models of dendritic branching patterns. Mathematical Biosciences, 205(2), 271–296.

  • Loader, C. (1999). Bandwidth selection: Classical or plug-in? The Annals of Statistics, 27(2), 415–438.

  • Luczak, A. (2006). Spatial embedding of neuron trees modeled by diffusive growth. Journal of Neuroscience Methods, 157, 132–141.

  • Markram, H. (2006). The blue brain project. Nature Reviews Neuroscience, 7, 153–160.

  • Myatt, D., Hadlington, T., Ascoli, G., & Nasuto, S. (2007). Inter-user variability of semi-manually reconstructed dendritic trees with the freeware tool Neuromantic. Journal of Microscopy, (submitted).

  • Neapolitan, R. E. (2003). Learning Bayesian networks. Upper Saddle River, NJ: Prentice Hall.

  • Nowakowski, R. S., Hayes, N. S., & Egger, M. D. (1992). Competitive interactions during dendritic growth: A simple stochastic growth algorithm. Brain Research, 576, 152–156.

  • Parzen, E. (1962). On estimation of a probability density function and mode. Annals of Mathematical Statistics, 33(3), 1065–1076.

  • Raykar, V., & Duraiswami, R. (2006). Very fast optimal bandwidth selection for univariate kernel density estimation. Technical Report CS-TR-4774, University of Maryland, College Park.

  • Rihn, L., & Claiborne, B. (1990). Dendritic growth and regression in rat dentate granule cells during late postnatal development. Brain Research. Developmental Brain Research, 54(1), 115–124.

  • Robert, C. (2007). The Bayesian choice: From decision-theoretic foundations to computational implementation. New York, NY: Springer-Verlag.

  • Samsonovich, A. V., & Ascoli, G. A. (2003). Statistical morphological analysis of hippocampal principal neurons indicates cell-specific repulsion of dendrites from their own cell. Journal of Neuroscience Research, 71, 173–187.

  • Samsonovich, A. V., & Ascoli, G. A. (2005a). Algorithmic description of hippocampal granule cell dendritic morphology. Neurocomputing, 65–66, 253–260.

  • Samsonovich, A. V., & Ascoli, G. A. (2005b). Statistical determinants of dendritic morphology in hippocampal pyramidal neurons: A hidden Markov model. Hippocampus, 15, 166–183.

  • Scorcioni, R., Lazarewicz, M. T., & Ascoli, G. A. (2004). Quantitative morphometry of hippocampal pyramidal cells: Differences between anatomical classes and reconstructing laboratories. Journal of Comparative Neurology, 473, 177–193.

  • Scott, D. (1992). Multivariate density estimation: Theory, practice, and visualization. New York, NY: John Wiley and Sons.

  • Scott, E., & Luo, L. (2001). How do dendrites take their shape? Nature Neuroscience, 4(4), 359–365.

  • Silverman, B. (1986). Density estimation for statistics and data analysis. London, UK: Chapman and Hall.

  • Steuber, V., De Schutter, E., & Jaeger, D. (2004). Passive models of neurons in the deep cerebellar nuclei: The effect of reconstruction errors. Neurocomputing, 58–60, 563–568.

  • Szilágyi, T., & De Schutter, E. (2004). Effects of variability in anatomical reconstruction techniques on models of synaptic integration by dendrites: A comparison of three internet archives. European Journal of Neuroscience, 19, 1257–1266.

  • Thrun, S., Burgard, W., & Fox, D. (2005). Probabilistic robotics. Cambridge, MA: MIT Press.

  • Torben-Nielsen, B., Tuyls, K., & Postma, E. O. (2008). EvOL-Neuron: Virtual neuron generation. Neurocomputing, 71(4–6), 963–972.

  • van Pelt, J., & Schierwagen, A. (2004). Morphological analysis and modeling of neuronal dendrites. Mathematical Biosciences, 188, 147–155.

  • Wand, M., & Jones, C. (1995). Kernel smoothing. London, UK: Chapman and Hall.

  • Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). San Francisco, CA: Morgan Kaufmann.


Acknowledgements

BTN was funded by the Interactive Collaborative Information Systems (ICIS) project, supported by the Dutch Ministry of Economic Affairs, grant nr. BSIK03024. SV is supported by the Dutch Organization for Scientific Research (NWO), ToKeN programme, viz. the IPOL project, grant nr. 634.000.435. The authors thank Drs. Jaap van Pelt and Klaus Stiefel, and Steven de Jong for fruitful discussions on this topic. We also wish to thank the three anonymous reviewers for their suggestions that improved the manuscript.

Corresponding author

Correspondence to Benjamin Torben-Nielsen.

Appendix: Kernel Density Estimates

In this appendix we give more details on the implementation of kernel density estimation for univariate and bivariate prototype data. We also explain how the automatic selection of the bandwidth parameter is performed, an important issue not yet addressed.

Univariate KDEs

An intuitive, non-parametric way to model observed data is by means of a histogram estimator, which is formally defined by:

$$ \hat{f}(x) = \frac{| \left\{ x_i \in N(x), \: i=1, \dots, n \right\} |}{nh} \enspace , $$
(5)

where N(x) is the bin centred at x with width h. A KDE is a smooth version of the histogram estimator in which a kernel function K counts observations close to x with weights that decrease with the distance from x (Parzen 1962):

$$ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^n K\left(\frac{x - x_i}{h}\right) \enspace . $$
(6)

A popular choice of kernel function is the Gaussian distribution with zero mean and unit variance. The KDE with Gaussian kernel function becomes:

$$ \hat{f}(x) = \frac{1}{nh\sqrt{2\pi}} \sum_{i=1}^n \exp\left[ -\frac{(x - x_i)^2}{2h^2} \right] \enspace , $$
(7)

with h the bandwidth (standard deviation) of the Gaussian components. The kernel density estimator is a legitimate probability density function when the kernel function is non-negative everywhere and integrates to one. It is easily verified that this holds for the Gaussian. The parameter h acts as the smoothing parameter.
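
As a concrete illustration of Eq. 7, the following NumPy sketch evaluates a univariate Gaussian KDE and draws new values from it (a draw amounts to picking a random observation and perturbing it with zero-mean Gaussian noise of standard deviation h). The function names are ours; this is a minimal sketch, not the KDE-Neuron implementation.

```python
import numpy as np

def gaussian_kde(x, sample, h):
    """Evaluate the Gaussian KDE of Eq. 7 at the points in x."""
    x = np.asarray(x, dtype=float)[:, None]        # evaluation points, shape (m, 1)
    s = np.asarray(sample, dtype=float)[None, :]   # observations x_1..x_n, shape (1, n)
    n = s.size
    # One Gaussian bump of width h centred on each observation, averaged.
    return np.exp(-(x - s) ** 2 / (2 * h ** 2)).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

def sample_kde(sample, h, size, rng=None):
    """Draw new values from the KDE: pick observations uniformly, add N(0, h^2) noise."""
    rng = np.random.default_rng() if rng is None else rng
    picks = rng.choice(np.asarray(sample, dtype=float), size=size)
    return picks + rng.normal(0.0, h, size=size)
```

For instance, with observed segment lengths collected in an array `lengths` (a hypothetical name), `gaussian_kde(np.linspace(0, 100, 200), lengths, h)` evaluates the estimated length density on a grid, and `sample_kde(lengths, h, 50)` draws fifty virtual lengths from it.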

Multivariate KDEs

Suppose we have a sample consisting of n bivariate observations \((x_1, y_1), \dots, (x_n, y_n)\) that are independently drawn from an unknown probability distribution f over the random variable (X, Y). The KDE for the joint distribution is then:

$$ \hat{f}(x, y) = \frac{1}{nh_xh_y} \sum_{i=1}^n K\left(\frac{x - x_i}{h_x}\right) K\left(\frac{y - y_i}{h_y}\right) \enspace , $$
(8)

with \(h_x\) and \(h_y\) the bandwidths for the random variables X and Y, respectively. Since we are using the Gaussian as kernel function, the estimator for the joint distribution becomes:

$$ \hat{f}(x, y) = \frac{1}{2\pi n h_x h_y} \sum_{i=1}^n \exp\left[ -\frac{(x - x_i)^2}{2h_x^2} -\frac{(y - y_i)^2}{2h_y^2} \right]. $$
(9)

In the rest of this appendix we write equations explicitly for the Gaussian kernel although they can be written in terms of any other kernel function.

The kernel density estimate of a marginal distribution can be calculated by an application of the sum rule of probability. For example, the estimator for the marginal distribution \(f_X(x)\) of X (assuming Y to be a real-valued random variable) is:

$$ \hat{f}_X(x) = \int^{+\infty}_{-\infty} \hat{f}(x, y) \text{d} y = \frac{1}{nh_x\sqrt{2\pi}} \sum_{i=1}^n \exp\left[ -\frac{(x - x_i)^2}{2h_x^2} \right], $$
(10)

since the Gaussian integrates to one over the real line. The kernel density estimator for \(f_Y(y)\) is defined analogously. A kernel density estimator for a conditional density, e.g., the estimator of the density function of X given that Y takes some value y, is computed by a straightforward application of the product rule of probability:

$$ \hat{f}(x | y) = \frac{\hat{f}(x, y)}{\hat{f}_Y(y)} = \frac{ \frac{1}{h_x \sqrt{2\pi}} \sum_{i=1}^n \exp\left[ -\frac{(x - x_i)^2}{2h_x^2} -\frac{(y - y_i)^2}{2h_y^2} \right] } {\sum_{i=1}^n \exp\left[ -\frac{(y - y_i)^2}{2h_y^2} \right]}. $$
(11)

The above equations for density estimation with respect to bivariate observations can be generalized in a straightforward manner to observations with more than two dimensions.
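
Eqs. 9 and 11 translate directly into code. The sketch below evaluates the joint and conditional estimators and draws x from the conditional f(x | y), which is a Gaussian mixture centred on the observed x_i with weights given by the kernel distances of the y_i to the conditioning value y. As before, this is a minimal sketch under our own naming, not the paper's code.

```python
import numpy as np

def kde_joint(x, y, xs, ys, hx, hy):
    """Evaluate the bivariate Gaussian KDE of Eq. 9 at the point (x, y)."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    w = np.exp(-(x - xs) ** 2 / (2 * hx ** 2) - (y - ys) ** 2 / (2 * hy ** 2))
    return w.sum() / (2 * np.pi * xs.size * hx * hy)

def kde_conditional(x, y, xs, ys, hx, hy):
    """Evaluate the conditional estimator of Eq. 11, f(x | y)."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    wy = np.exp(-(y - ys) ** 2 / (2 * hy ** 2))    # kernel weight of each observation at y
    wxy = wy * np.exp(-(x - xs) ** 2 / (2 * hx ** 2))
    return wxy.sum() / (hx * np.sqrt(2 * np.pi) * wy.sum())

def sample_conditional(y, xs, ys, hx, hy, rng=None):
    """Draw x from f(x | y): choose observation i with probability proportional
    to its kernel weight at y, then add N(0, hx^2) noise around x_i."""
    rng = np.random.default_rng() if rng is None else rng
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    wy = np.exp(-(y - ys) ** 2 / (2 * hy ** 2))
    i = rng.choice(xs.size, p=wy / wy.sum())
    return xs[i] + rng.normal(0.0, hx)
```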

Automatic Bandwidth Selection

The bandwidth is the single parameter of kernel density estimation with a Gaussian kernel. It acts as a smoothing factor and consequently determines the quality of the estimator. The best bandwidth has to be learned from the data since the data, besides prior knowledge if available, are the only objective basis for deciding whether the estimator is oversmoothing or undersmoothing. Various methods have been proposed for automatic bandwidth selection; brief surveys can be found in Wand and Jones (1995) and Jones et al. (1996).

It is natural to define the best bandwidth as the one resulting in the best KDE. The quality of a KDE \(\hat{f}(x)\) with respect to the true density f(x) is often measured by the integrated squared error:

$$ \text{ISE}(\hat{f}, f) = \int^{+\infty}_{-\infty} \left(\hat{f}(x) - f(x)\right)^2 dx \enspace . $$
(12)

The ISE depends on a particular sample of n observations since the sample is used as information for the KDE. The mean integrated squared error averages the integrated squared error over all samples:

$$ \text{MISE}(\hat{f}, f) = \int^{+\infty}_{-\infty} \text{E}\left[\left(\hat{f}(x) - f(x)\right)^2\right] dx \enspace , $$
(13)

where \(\text{E}\left[\cdot\right]\) denotes the expected value. Clearly, the bandwidth with minimum MISE gives an estimator that, averaged over samples of n observations, has the lowest ISE. Given that n is sufficiently large, the MISE can be approximated by (Scott 1992):

$$ \text{AMISE}(\hat{f}, f) = \frac{R(K)}{nh} + \frac{h^4\mu_2(K)^2R(f'')}{4} \enspace , $$
(14)

where K is the kernel and the following definitions are used:

$$ R(g) = \int^{+\infty}_{-\infty} g(x)^2 dx \mbox{, and } \mu_2(g) = \int^{+\infty}_{-\infty} x^2g(x) dx \enspace . $$
(15)

Setting the derivative of Eq. 14 with respect to the bandwidth to zero, that is, \(-R(K)/(nh^2) + h^3\mu_2(K)^2R(f'') = 0\), and solving for h gives the optimal bandwidth:

$$ h = \left[ \frac{R(K)}{\mu_2(K)^2 R(f'') n} \right]^{1/5} \enspace . $$
(16)

Unfortunately, this solution depends on the second-order derivative of the true density f(x), which is unknown. Moreover, estimating this derivative is a more difficult problem than estimating the density itself (Raykar and Duraiswami 2006).
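
One standard way to make Eq. 16 operational (a textbook worked example, not taken from this paper) is to substitute a Gaussian reference density with standard deviation \(\sigma\) for the unknown f. For the Gaussian kernel, \(R(K) = 1/(2\sqrt{\pi})\) and \(\mu_2(K) = 1\), while \(R(f'') = 3/(8\sqrt{\pi}\sigma^5)\) for a Gaussian density, so Eq. 16 becomes:

$$ h = \left[ \frac{1/(2\sqrt{\pi})}{\frac{3}{8\sqrt{\pi}\sigma^5}\, n} \right]^{1/5} = \left( \frac{4}{3n} \right)^{1/5} \sigma \approx 1.06 \, \sigma \, n^{-1/5} \enspace . $$

This is exactly Eq. 17 below with d = 1, and it is the reasoning behind the rule of thumb discussed next.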

There are several approaches to approximating the best bandwidth in terms of AMISE. The basic approach, known as the rule of thumb, assumes that the data are generated by a Gaussian distribution (Wand and Jones 1995). Given data of dimensionality d, the optimal bandwidth for the i-th variable is then:

$$ h_i = \left( \frac{4}{d+2} \right)^{1/(d+4)} n^{-1/(d+4)} \hat{\sigma}_i \enspace , $$
(17)

with \(\hat{\sigma}_i\) the estimator for the standard deviation of the i-th variable. For univariate data (d = 1), a more specific estimate of the bandwidth is:

$$ h_i = 0.9 \: S \: n^{-1/5} \enspace , $$
(18)

with S the minimum of the standard deviation and roughly three-quarters of the interquartile range (Silverman 1986). In our experiments we found that the rule-of-thumb bandwidths for multivariate data are often better than the bandwidths found by more advanced approaches, even though the assumption of a Gaussian distribution is not satisfied. This is in line with the findings reported by Loader (1999). Both rules are sketched below.
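
The rules of thumb in Eqs. 17 and 18 are straightforward to implement. Below is a minimal NumPy sketch, assuming the data are arranged as an (n, d) array; following Silverman (1986), S is implemented as the minimum of the standard deviation and the interquartile range divided by 1.34 (roughly three-quarters of the IQR). Function names are ours.

```python
import numpy as np

def rule_of_thumb_bandwidths(data):
    """Eq. 17: rule-of-thumb bandwidth for each of the d variables.

    data -- array of shape (n, d), one row per observation
    """
    n, d = data.shape
    sigma = data.std(axis=0, ddof=1)                # per-variable standard deviations
    return (4.0 / (d + 2)) ** (1.0 / (d + 4)) * n ** (-1.0 / (d + 4)) * sigma

def silverman_bandwidth(x):
    """Eq. 18: Silverman's rule for univariate data."""
    x = np.asarray(x, dtype=float)
    iqr = np.subtract(*np.percentile(x, [75, 25]))  # interquartile range
    s = min(x.std(ddof=1), iqr / 1.34)              # robust spread estimate S
    return 0.9 * s * x.size ** (-1.0 / 5.0)
```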

Cite this article

Torben-Nielsen, B., Vanderlooy, S. & Postma, E.O. Non-parametric Algorithmic Generation of Neuronal Morphologies. Neuroinform 6, 257–277 (2008). https://doi.org/10.1007/s12021-008-9026-x
