
The novel aggregation function-based neuron models in complex domain


Abstract

The computational power of a neuron lies in the spatial grouping of synapses belonging to a dendritic tree. Attempts to give a mathematical representation to this grouping process continue to be a fascinating field of work for researchers in the neural network community. In the literature, we generally find neuron models that comprise a summation, radial basis, or product aggregation function as the basic unit of a feed-forward multilayer neural network. All these models and their corresponding networks have their own merits and demerits. The MLP constructs a global approximation to the input–output mapping, while an RBF network, using exponentially decaying localized non-linearities, constructs a local approximation to the input–output mapping. In this paper, we propose two novel compensatory-type aggregation functions for artificial neurons. They produce the net potential as a linear or non-linear composition of the basic summation and radial basis operations over a set of input signals. The neuron models based on these aggregation functions ensure faster convergence and better training and prediction accuracy. The learning and generalization capabilities of these neurons have been tested on various classification and functional mapping problems. These neurons have also shown excellent generalization ability on two-dimensional transformations.

Notes

  1. In this paper, a neuron with only the summation aggregation function is referred to as a ‘conventional’ neuron, and a network of such neurons as an ‘MLP’.

  2. \(\Re\) and \(\Im\) stand for the real and imaginary components of a complex value.


Author information

Correspondence to Bipin Kumar Tripathi.

Appendix: Derivation of update rule

In a multilayer network, we consider a commonly used three-layer structure (L–M–N). The first layer has L inputs, the second layer has M proposed neurons described in Sect. 2, and the output layer consists of N conventional neurons. All weights, thresholds, and input–output signals are complex numbers. By convention, \(w_{lm}\) is the weight that connects the \(l{\rm th}\) neuron to the \(m{\rm th}\) neuron, \(\eta \in [0,1]\) is the learning rate, and \(f^{\prime}\) is the derivative of the function f. Let \(Z=[ z_1, z_2, \ldots, z_L]\) be the vector of input signals; \(Z^{\rm T}\) is the transpose of Z and \(\overline{z}\) is the complex conjugate of z. \(W^S_m = [ w^S_{1m}, w^S_{2m}, \ldots, w^S_{Lm}]\) is the vector of weights from the inputs \((l=1,\ldots, L)\) to the summation part of the \(m{\rm th}\) SRS or SRP neuron, and \(W^{{\rm RB}}_m=[w^{{\rm RB}}_{1m}, w^{{\rm RB}}_{2m}, \ldots, w^{{\rm RB}}_{Lm}]\) is the vector of weights from the inputs to the radial basis part of the \(m{\rm th}\) SRS or SRP neuron. \(w_0\) is a bias weight and \(z_0\) is the bias input. From Eq. 7, the output of each neuron \((m=1,\ldots, M)\) in the hidden layer can be expressed as:

$$ Y_m = f(\Re(V_m))+j f(\Im(V_m))=\Re(Y_m)+j \Im(Y_m) $$
(16)
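As an informal illustration (not part of the original derivation), a minimal NumPy sketch of the split-type mapping in Eq. 16; the logistic f used here is only a placeholder, since the actual activation is specified in Sect. 2:

```python
import numpy as np

def split_activation(v, f=lambda x: 1.0 / (1.0 + np.exp(-x))):
    """Apply a real activation separately to the real and imaginary
    parts of the complex net potential V_m (Eq. 16)."""
    return f(np.real(v)) + 1j * f(np.imag(v))
```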

Let \(V_m^\sigma\) be the net potential of an SRS neuron in the hidden layer; then, from Eqs. 4 and 5:

$$ V_m=V_m^\sigma=\Re\left(V_m^\sigma\right)+j \Im\left(V_m^\sigma\right)=H_m W^S_m Z^{\rm T} + K_m {\exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right)} + w_{0m} z_0 $$
(17)
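A hedged NumPy sketch of the SRS net potential in Eq. 17, taking z, w_s and w_rb as complex vectors over the L inputs and z0 = 1; the function and argument names are illustrative only:

```python
import numpy as np

def srs_net_potential(z, w_s, w_rb, H, K, w0, z0=1.0 + 0.0j):
    """SRS net potential (Eq. 17): scaled summation term, scaled
    radial-basis term, and bias."""
    summation = H * np.dot(w_s, z)                          # H_m W_m^S Z^T
    radial = K * np.exp(-np.linalg.norm(z - w_rb) ** 2)     # K_m exp(-||Z - W_m^RB||^2)
    return summation + radial + w0 * z0
```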

Let \(V_m^\pi\) be the net potential of an SRP neuron in the hidden layer; then, from Eqs. 1 and 2:

$$ V_m = V_m^\pi = H_m W^S_m Z^{\rm T} + K_m {\exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right)} + H_m W^S_m Z^{\rm T} K_m {\exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right)}+w_{0m}z_0 $$
(18)

The net internal potential of the SRP neuron in Eq. 18 can also be expressed term-wise as follows:

$$ V^{\pi}_m =V^{\pi_{1}}_m+V^{\pi_{2}}_m+V^{\pi_{1}}_m V^{\pi_{2}}_m+w_{0m}z_{0} $$
(19)
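The corresponding sketch for the SRP potential of Eqs. 18 and 19, under the same assumptions; the third term is the compensatory product of the summation and radial-basis parts:

```python
import numpy as np

def srp_net_potential(z, w_s, w_rb, H, K, w0, z0=1.0 + 0.0j):
    """SRP net potential (Eqs. 18-19): V^{pi1} + V^{pi2} + V^{pi1} V^{pi2} + bias."""
    v1 = H * np.dot(w_s, z)                                 # summation part V^{pi1}
    v2 = K * np.exp(-np.linalg.norm(z - w_rb) ** 2)         # radial-basis part V^{pi2}
    return v1 + v2 + v1 * v2 + w0 * z0
```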

The output of a neuron (\(n=1,\ldots, N\)) in the output layer is given by:

$$ Y_n=f_{{\mathbf{C}}}(V_n)=f_{{\mathbf{C}}}\left(\sum_{m=1}^{M} w_{mn}Y_{m}+w_{0n}z_{0}\right) $$
(20)
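A sketch of the output layer of Eq. 20, with W_out taken as an N x M complex matrix and f = tanh assumed purely for illustration:

```python
import numpy as np

def forward_output(y_hidden, W_out, w0_out, f=np.tanh, z0=1.0 + 0.0j):
    """Response of the N conventional output neurons (Eq. 20)."""
    v = W_out @ y_hidden + w0_out * z0      # net potentials V_n, shape (N,)
    y = f(np.real(v)) + 1j * f(np.imag(v))  # split-type activation f_C
    return y, v
```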

Let YD be the desired output. The output error consists of real and imaginary parts and is defined as:

$$ e_n=\Re(e_n)+j\Im(e_n)=YD_n-Y_n $$
(21)

where \(\Re(e_n)=\Re(YD_n)-\Re(Y_n)\) and \(\Im(e_n)=\Im(YD_n)-\Im(Y_n).\) The real-valued cost function (MSE) is given as:

$$ E=\frac{1}{2N}\sum_{n=1}^{N} e_n \overline{e_n}=\frac{1}{2N}\sum_{n=1}^{N}\left[(\Re(e_n))^2+(\Im(e_n))^2 \right] $$
(22)
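Eq. 22 translates directly into code; a small sketch, with yd and y the complex arrays of desired and actual outputs:

```python
import numpy as np

def complex_mse(yd, y):
    """Real-valued cost of Eq. 22: (1/2N) * sum_n e_n * conj(e_n)."""
    e = yd - y
    return 0.5 * np.mean(np.real(e) ** 2 + np.imag(e) ** 2)
```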

The complex-BP algorithm minimizes the cost function E by recursively altering the weight coefficients using gradient descent:

$$ w^{\rm new}=w^{\rm old}-\eta \bigtriangledown_{w}E $$
(23)

where the gradient \(\bigtriangledown_{w}E\) is derived with respect to both real and imaginary parts of complex weights.

$$ \Updelta w=-\eta \bigtriangledown_{w}E=-\eta \bigtriangledown_{\Re(w)} E -j \eta \bigtriangledown_{\Im(w)} E=-\eta \left({\frac{\partial E}{\partial \Re(w)}}+j{\frac{\partial E}{\partial \Im(w)}} \right) $$
(24)
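The analytic rules derived below can be sanity-checked against a finite-difference version of Eq. 24. The helper below is a verification aid only, not part of the derivation; E stands for any real-valued function of a single complex weight:

```python
def numerical_complex_grad(E, w, h=1e-6):
    """Finite-difference estimate of dE/dRe(w) + j*dE/dIm(w) (Eq. 24)."""
    g_re = (E(w + h) - E(w - h)) / (2.0 * h)
    g_im = (E(w + 1j * h) - E(w - 1j * h)) / (2.0 * h)
    return g_re + 1j * g_im
```

The corresponding step of Eq. 23 is then w_new = w - eta * numerical_complex_grad(E, w).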

For any weight in the output layer, \(w = w_{mn}\):

$$ {\frac{-\partial E}{\partial \Re(w_{mn})}}=\left[\Re(e_n) f^{\prime}(\Re(V_n)){\frac{\partial \Re(V_n)}{\partial \Re(w_{mn})}}+\Im(e_n)f^{\prime}(\Im(V_n)){\frac{\partial \Im(V_n)} {\partial \Re(w_{mn})}}\right] $$
(25)

and

$$ {\frac{-\partial E}{\partial \Im(w_{mn})}} = \left[\Re(e_n) f^{\prime}(\Re(V_n)){\frac{\partial \Re(V_n)}{\partial \Im(w_{mn})}} +\Im(e_n) f^{\prime}(\Im(V_n)){\frac{\partial \Im(V_n)}{\partial \Im(w_{mn})}}\right] $$
(26)
$$ \Updelta w_{mn}=\eta \left[\Re(e_n)f^{\prime}(\Re(V_n)) \left\{{\frac{\partial \Re(V_n)}{\partial \Re(w_{mn})}}+j{\frac{\partial \Re(V_n)}{\partial \Im(w_{mn})}} \right\}+\Im(e_n)f^{\prime}(\Im(V_n))\left\{{\frac{\partial \Im(V_n)}{\partial \Re(w_{mn})}}+j{\frac{\partial \Im(V_n)}{\partial \Im(w_{mn})}}\right\}\right] $$
$$ \begin{aligned}\Updelta w_{mn}&=&\eta {\overline{Y}_m} (\Re(e_n) f^{\prime}(\Re(V_n))+ j\Im(e_n)f^{\prime}(\Im(V_n)))\\ \Updelta w_{0n}&=&\eta{\overline{z}_0} (\Re(e_n) f^{\prime}(\Re(V_n))+j \Im(e_n) f^{\prime}(\Im(V_n))) \end{aligned}$$
(27)
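A hedged sketch of the output-layer increments of Eq. 27, assuming z0 = 1 and a real derivative f_prime; e and v_out are the complex error and net-potential vectors of the N output neurons, and y_hidden collects the M hidden outputs:

```python
import numpy as np

def output_layer_increments(e, v_out, y_hidden, eta, f_prime, z0=1.0 + 0.0j):
    """Weight increments of Eq. 27."""
    # delta_n = Re(e_n) f'(Re V_n) + j Im(e_n) f'(Im V_n)
    delta = np.real(e) * f_prime(np.real(v_out)) + 1j * np.imag(e) * f_prime(np.imag(v_out))
    dW = eta * np.outer(delta, np.conj(y_hidden))   # dW[n, m] = eta * conj(Y_m) * delta_n
    dw0 = eta * np.conj(z0) * delta                 # increments for the bias weights w_0n
    return dW, dw0
```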

The weight update equations for the weights between the input and hidden layers can be obtained by calculating the gradient of the cost function with respect to these weights. For \(w=w_{lm}\), following the chain rule of differentiation:

$$ \begin{aligned} {\frac{-\partial E}{\partial \Re(w_{lm})}}&={\frac{1}{N}}{\frac{\partial (\Re(V_m))}{\partial \Re(w_{lm})}}f^{\prime}(\Re(V_m)) \sum_{n=1}^{N} \{\Re(e_n)f^{\prime}(\Re(V_n)) \Re(w_{mn})+\Im(e_n)f^{\prime}(\Im(V_n))\Im(w_{mn})\}\\ &\quad+ {\frac{1}{N}}{\frac{\partial (\Im(V_m))}{\partial \Re(w_{lm})}} f^{\prime}(\Im(V_m))\sum_{n=1}^{N}\{\Im(e_n)f^{\prime}(\Im(V_n)) \Re(w_{mn})- \Re(e_n) f^{\prime}(\Re(V_n)) \Im(w_{mn})\} \end{aligned} $$
(28)
  1. Case I: SRS neuron in the hidden layer

    In a three-layer network, the difference lies at the hidden layer, where either of the proposed neurons can be used. When an SRS neuron is in the hidden layer, \(V_m=V_m^\sigma\) and Eq. 28 can be rewritten as:

    $$ \begin{aligned} {\frac{-\partial E}{\partial \Re(w_{lm}^S)}} =&\;{\frac{1}{N}}\left\{{\frac{\partial \left(\Re\left(V_m^\sigma \right) \right)}{\partial \Re\left(w_{lm}^S \right)}}\Re\left(\Gamma_{m} ^{\sigma} \right)+{\frac{\partial \left(\Im\left(V_m^\sigma \right) \right)} {\partial \Re\left(w_{lm}^S \right)}}\Im\left(\Gamma_{m}^{\sigma} \right) \right\}\\ =&\, {\frac{1}{N}}\left\{(\Re(H_m) \Re(z_l)-\Im(H_m) \Im(z_l)) \Re\left(\Gamma_{m}^ {\sigma}\right)+(\Re(H_m) \Im(z_l)+\Im(H_m) \Re(z_l)) \Im\left(\Gamma_{m}^{\sigma} \right) \right\} \end{aligned} $$
    (29)

    Similarly

    $$ {\frac{-\partial E}{\partial \Im\left(w_{lm}^S\right)}}={\frac{1}{N}}\left\{ (\Re(H_m) (-\Im(z_l))-\Im(H_m)\Re(z_l)) \Re\left(\Gamma_{m}^{\sigma}\right)+(\Re(H_m) \Re(z_l)-\Im(H_m) \Im(z_l)) \Im\left(\Gamma_{m}^{\sigma}\right) \right\} $$
    (30)

    Now substituting Eqs. 29 and 30 into Eq. 24:

    $$ \Updelta w^{S}_{lm}={\frac{\eta}{N}}\left\{ (\Re(z_l)-j \Im(z_l)) (\Re(H_m)-j \Im(H_m)) (\Re(\Gamma_{m}^{\sigma}) +j \Im(\Gamma_{m}^{\sigma}))\right\}={\frac{\eta}{N}} \overline{z_l} \overline{H}_{m} \Gamma_{m}^{\sigma} $$
    (31)

    Following the same procedure, the update equations for the other learning parameters can be obtained:

    $$ \Updelta w^{{\rm RB}}_{lm}={\frac{2\eta} {N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \left(z_l - w^{{\rm RB}}_{lm}\right) \left\{ \Re\left(\Gamma_{m}^{\sigma}\right) \Re(K_m) + \Im\left(\Gamma_{m}^{\sigma}\right)\Im(K_m) \right\} $$
    (32)
    $$ \Updelta H_m={\frac{\eta}{N}} \overline{\left(W^{S}_{m} Z^{\rm T}\right)} \Gamma_{m}^{\sigma} \quad \Updelta K_m={\frac{\eta}{N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \Gamma_{m}^{\sigma} \quad \Updelta w_{0m}={\frac{\eta}{N}} \overline{z_0} \Gamma_{m}^{\sigma} $$
    (33)
  2. Case II: SRP neuron in the hidden layer

    When an SRP neuron is in the hidden layer, \(V_m=V_m^\pi\) and Eq. 28 can be rewritten as:

    $$ \begin{aligned} {\frac{-\partial E}{\partial \Re\left(w_{lm}^S\right)}}=&\;{\frac{1}{N}}\left\{ {\frac{\partial\left(\Re\left(V_{m}^{\pi}\right)\right)} {\partial\Re\left(w_{lm}^S\right)}}\Re\left(\Gamma_{m}^{\pi}\right)+{\frac{\partial\left(\Im\left(V_m^\pi\right)\right)}{\partial\Re\left(w_{lm}^S\right)}} \Im\left(\Gamma_{m}^{\pi}\right) \right\}\\=&\; {\frac{1}{N}} \Re\left(\Gamma_{m}^{\pi}\right)\left\{\left(\Re(H_m)+\Re(H_m)\Re\left(V^{\pi_{2}}_m\right)-\Im(H_m)\Im\left(V^{\pi_{2}}_m\right)\right)\Re(z_l) -\left(\Im(H_m)+\Im(H_m)\Re\left(V^{\pi_{2}}_m\right)+\Re(H_m)\Im\left(V^{\pi_{2}}_m\right)\right)\Im(z_l)\right\}\\&\quad+\; \Im\left(\Gamma_{m}^{\pi}\right)\left\{\left(\Im(H_m)+\Im(H_m)\Re\left(V^{\pi_{2}}_m\right)+\Re(H_m)\Im(V^{\pi_{2}}_m)\right)\Re(z_l) +\left(\Re(H_m)+\Re(H_m)\Re\left(V^{\pi_{2}}_m\right)-\Im(H_m)\Im(V^{\pi_{2}}_m)\right)\Im(z_l)\right\}\end{aligned} $$
    (34)

    Similarly

    $$ \begin{aligned} {\frac{-\partial E}{\partial \Im(w_{lm}^S)}}=&\;{\frac{1}{N}} \left\{{\frac{\partial (\Re(V_m^\pi))}{\partial \Im(w_{lm}^S)}} \Re(\Gamma_{m}^{\pi})+{\frac{\partial (\Im(V_m^\pi))}{\partial \Im(w_{lm}^S)}} \Im(\Gamma_{m}^{\pi}) \right\}\\ =&\;{\frac{1}{N}} \Re(\Gamma_{m}^{\pi}) \left\{-(\Re(H_m)+\Re(H_m)\Re(V^{\pi_{2}}_m) -\Im(H_m)\Im(V^{\pi_{2}}_m))\Im(z_l) - (\Im(H_m)+\Im(H_m)\Re(V^{\pi_{2}}_m) +\Re(H_m)\Im(V^{\pi_{2}}_m))\Re(z_l)\right\}\\ &+\; \Im(\Gamma_{m}^{\pi})\left\{(\Re(H_m)+\Re(H_m)\Re(V^{\pi_{2}}_m) -\Im(H_m)\Im(V^{\pi_{2}}_m))\Re(z_l)-(\Im(H_m)+\Im(H_m)\Re(V^{\pi_{2}}_m) +\Re(H_m)\Im(V^{\pi_{2}}_m))\Im(z_l)\right\} \end{aligned} $$
    (35)

Now substituting Eqs. 34 and 35 into Eq. 24:

$$ \begin{aligned} \Updelta w^{S}_{lm}=&{\frac{\eta}{N}}\left\{ (\Re(z_l)-j \Im(z_l)) (\Re(H_m)-j \Im(H_m)) (1+\Re(V^{\pi_{2}}_m)-j \Im(V^{\pi_{2}}_m)) (\Re(\Gamma_{m}^{\pi})+j \Im(\Gamma_{m}^{\pi})) \right\}\\ =&{\frac{\eta}{N}} \overline{z_l} \overline{H}_{m} \left(1+\overline{V_m^{\pi_2}}\right) \Gamma_{m}^{\pi} \end{aligned} $$
(36)

Following the same procedure, the update equations for the other learning parameters can be obtained (a code sketch covering both cases is given at the end of this appendix):

$$ \begin{aligned} \Updelta w^{{\rm RB}}_{lm} =&{\frac{2\eta}{N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \left(z_l - w^{{\rm RB}}_{lm}\right)\\ &\quad\left[ \Re(\Gamma_{m}^{\pi}) \left\{ \Re(K_m) (1+\Re({V}^{\pi_1}_{m}))- \Im(K_m)\Im({V}^{\pi_1}_{m}) \right\}+ \Im(\Gamma_{m}^{\pi}) \left\{ \Im(K_m) (1+\Re({V}^{\pi_1}_{m}))+ \Re(K_m)\Im({V}^{\pi_1}_{m}) \right\} \right] \end{aligned} $$
(37)
$$ \Updelta H_m = {\frac{\eta}{N}} \overline{\left(W^S_m Z^{\rm T}\right)} \left(1+\overline{V_m^{\pi_2}}\right) \Gamma_{m}^{\pi} $$
(38)
$$ \Updelta K_m={\frac{\eta}{N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \left(1+\overline{V_m^{\pi_1}}\right) \Gamma_{m}^{\pi} $$
(39)
$$ \Updelta w_{0m}={\frac{\eta}{N}} \overline{z_0} \Gamma_{m}^{\pi} $$
(40)

where (\(\Gamma_{m}^{\sigma}\) in Case I is defined analogously, with \(V_m^{\pi}\) replaced by \(V_m^{\sigma}\))

$$ \Re\left(\Gamma_{m}^{\pi}\right)=f^{\prime}\left(\Re\left(V_m^{\pi}\right)\right) \sum_{n=1}^{N} \{\Re(e_n)f^{\prime}(\Re(V_n)) \Re(w_{mn})+\Im(e_n) f^{\prime}(\Im(V_n))\Im(w_{mn})\} $$

and

$$ \Im(\Gamma_{m}^{\pi}) = f^{\prime}(\Im(V_m^{\pi})) \sum_{n=1}^{N} \{ \Im(e_n)f^{\prime}(\Im(V_n))\Re(w_{mn})-\Re(e_n)f^{\prime}(\Re(V_n)) \Im(w_{mn}) \} $$
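To make the two cases concrete, a hedged NumPy sketch of the hidden-layer increments (Eqs. 31–33 for an SRS neuron, Eqs. 36–40 for an SRP neuron), assuming z0 = 1 and a real derivative f_prime; the names hidden_layer_increments, delta_out and w_out_m are illustrative, not from the paper. Here delta_out[n] is the same per-output term that appears in Eq. 27, v_m is the neuron's net potential from Eq. 17 or Eq. 18, and w_out_m collects the weights w_mn leaving neuron m:

```python
import numpy as np

def hidden_layer_increments(z, w_s, w_rb, H, K, v_m, delta_out, w_out_m,
                            eta, N, f_prime, neuron="SRS", z0=1.0 + 0.0j):
    """Increments of Eqs. 31-33 (SRS) or Eqs. 36-40 (SRP) for one hidden neuron m."""
    # back-propagated term Gamma_m (the bracketed sums in the Gamma definitions)
    s = np.sum(np.conj(w_out_m) * delta_out)
    gamma = f_prime(np.real(v_m)) * np.real(s) + 1j * f_prime(np.imag(v_m)) * np.imag(s)

    rbf = np.exp(-np.linalg.norm(z - w_rb) ** 2)   # exp(-||Z - W_m^RB||^2)
    v1 = H * np.dot(w_s, z)                        # summation part V^{pi1}
    v2 = K * rbf                                   # radial-basis part V^{pi2}

    if neuron == "SRS":                            # Eqs. 31-33
        d_ws = (eta / N) * np.conj(z) * np.conj(H) * gamma
        d_wrb = (2 * eta / N) * rbf * (z - w_rb) * np.real(np.conj(gamma) * K)
        d_H = (eta / N) * np.conj(np.dot(w_s, z)) * gamma
        d_K = (eta / N) * rbf * gamma
    else:                                          # SRP, Eqs. 36-39
        d_ws = (eta / N) * np.conj(z) * np.conj(H) * (1 + np.conj(v2)) * gamma
        d_wrb = (2 * eta / N) * rbf * (z - w_rb) * np.real(np.conj(gamma) * K * (1 + v1))
        d_H = (eta / N) * np.conj(np.dot(w_s, z)) * (1 + np.conj(v2)) * gamma
        d_K = (eta / N) * rbf * (1 + np.conj(v1)) * gamma
    d_w0 = (eta / N) * np.conj(z0) * gamma         # Eq. 33 / Eq. 40
    return d_ws, d_wrb, d_H, d_K, d_w0
```

The real-part identities used above (e.g. Re(conj(Gamma) K) = Re(Gamma)Re(K) + Im(Gamma)Im(K)) simply collect the bracketed terms of Eqs. 32 and 37 into complex form.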

This completes the derivation.

Cite this article

Tripathi, B.K., Kalra, P.K. The novel aggregation function-based neuron models in complex domain. Soft Comput 14, 1069–1081 (2010). https://doi.org/10.1007/s00500-009-0502-5
