Abstract
The computational power of a neuron lies in the spatial grouping of synapses on its dendritic tree. Giving the synaptic grouping process a mathematical representation remains a fascinating line of work for researchers in the neural network community. In the literature, we generally find neuron models that use a summation, radial basis, or product aggregation function as the basic unit of a feed-forward multilayer neural network. All these models and their corresponding networks have their own merits and demerits. An MLP constructs a global approximation to an input–output mapping, while an RBF network, using exponentially decaying localized non-linearities, constructs a local approximation to the input–output mapping. In this paper, we propose two novel compensatory-type aggregation functions for artificial neurons. They produce the net potential as a linear or non-linear composition of basic summation and radial basis operations over a set of input signals. The neuron models based on these aggregation functions offer faster convergence and better training and prediction accuracy. The learning and generalization capabilities of these neurons have been tested on various classification and functional mapping problems. These neurons have also shown excellent generalization ability on two-dimensional transformations.
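The two aggregation functions described above can be illustrated with a short sketch. The exact forms below are assumptions inferred from the update rules in the appendix (the paper's Eqs. 1–7 are not shown in this excerpt): the SRS potential is taken as a weighted sum of a summation term and an RBF term, and the SRP potential as their product; the names `srs_potential`/`srp_potential` and the compensatory weights `H`, `K` are illustrative.

```python
import numpy as np

def srs_potential(z, wS, wRB, H, K, w0, z0=1.0):
    """SRS net potential: a linear composition of summation and
    radial basis terms. Assumed form (consistent with the appendix
    updates): V = H*(wS . z) + K*exp(-||z - wRB||^2) + w0*z0."""
    summation = np.dot(wS, z)                       # W^S Z^T
    rbf = np.exp(-np.linalg.norm(z - wRB) ** 2)     # RBF over complex inputs
    return H * summation + K * rbf + w0 * z0

def srp_potential(z, wS, wRB, H, K, w0, z0=1.0):
    """SRP net potential: a non-linear (product) composition of the
    same two terms. Assumed form: V = H*(wS . z)*(1 + K*rbf) + w0*z0."""
    summation = np.dot(wS, z)
    rbf = np.exp(-np.linalg.norm(z - wRB) ** 2)
    return (H * summation) * (1.0 + K * rbf) + w0 * z0
```

All quantities here are complex except the RBF term, which is real because `np.linalg.norm` of a complex vector uses moduli.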
Notes
In this paper, a neuron with only a summation aggregation function is referred to as a ‘conventional’ neuron, and a network of these neurons as an ‘MLP’.
\(\Re\) and \(\Im\) stand for the real and imaginary components of a complex value.
Appendix: Derivation of update rule
In a multilayer network, we consider a commonly used three-layer structure (L–M–N). The first layer has L inputs, the second layer has M proposed neurons described in Sect. 2, and the output layer consists of N conventional neurons. All weights, thresholds, and input–output signals are complex numbers. By convention, \(w_{lm}\) is the weight that connects the \(l{\rm th}\) neuron to the \(m{\rm th}\) neuron. \(\eta \in [0,1]\) is the learning rate and \(f^{\prime}\) is the derivative of the function f. Let \(Z=[ z_1, z_2, \ldots, z_L]\) be the vector of input signals. \(Z^{\rm T}\) is the transpose of vector Z and \(\overline{z}\) is the complex conjugate of z. \(W^S_m = [ w^S_{1m}, w^S_{2m}, \ldots, w^S_{Lm}]\) is the vector of weights from the inputs \((l=1,\ldots, L)\) to the summation part of the \(m{\rm th}\) SRS or SRP neuron, and \(W^{{\rm RB}}_m=[w^{{\rm RB}}_{1m}, w^{{\rm RB}}_{2m}, \ldots, w^{{\rm RB}}_{Lm}]\) is the vector of weights from the inputs to the radial basis part of the \(m{\rm th}\) SRS or SRP neuron. \(w_0\) is a bias and \(z_0\) is the bias input. From Eq. 7, the output of each neuron \((m=1,\ldots, M)\) in the hidden layer can be expressed as:
Let \(V_m^\sigma\) be the net potential of the SRS neuron in the hidden layer; then from Eqs. 4 and 5:
Let \(V_m^\pi\) be the net potential of the SRP neuron in the hidden layer; then from Eqs. 1 and 2:
The net internal potential of the SRP neuron in Eq. 18 can also be expressed term by term as follows:
The output of a neuron (\(n=1,\ldots, N\)) in the output layer is given by:
Let YD be the desired output. The output error consists of real and imaginary parts and is defined as:
where \(\Re(e_n)=\Re(YD_n)-\Re(Y_n)\) and \(\Im(e_n)=\Im(YD_n)-\Im(Y_n).\) The real-valued cost function (MSE) can be given as:
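As a concrete illustration, the real-valued cost over complex outputs can be computed from the real and imaginary error components separately. The normalization \(1/(2N)\) below is an assumption chosen so that the gradients carry the \(1/N\) factor seen in the later equations; the function name is illustrative.

```python
import numpy as np

def complex_mse(yd, y):
    """Real-valued MSE over complex outputs, assuming the convention
    E = (1/2N) * sum_n ( Re(e_n)^2 + Im(e_n)^2 ), with e = yd - y."""
    e = np.asarray(yd) - np.asarray(y)
    n = e.size
    return float((e.real ** 2 + e.imag ** 2).sum() / (2 * n))
```

Note that E is real even though the errors are complex, which is what allows ordinary gradient descent on the real and imaginary parts of each weight.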
The complex-BP algorithm minimizes the cost function E by recursively altering the weight coefficients based on gradient descent, given by:
where the gradient \(\bigtriangledown_{w}E\) is derived with respect to both real and imaginary parts of complex weights.
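In other words, each complex weight is treated as two real parameters. A minimal sketch of one such step, assuming \(\Updelta w = -\eta\,(\partial E/\partial \Re(w) + j\,\partial E/\partial \Im(w))\) and that the two real partial derivatives have already been computed:

```python
def update_weight(w, dE_dRe, dE_dIm, eta=0.1):
    """One gradient-descent step on a complex weight, descending
    along the real and imaginary partial derivatives of the real
    cost E (dE_dRe, dE_dIm are assumed precomputed)."""
    return w - eta * (dE_dRe + 1j * dE_dIm)
```

Packing the two real gradients into one complex number is what later lets the update rules collapse into compact conjugate forms such as Eq. 31.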
For any weight in the output layer, \(w = w_{mn}\)
and
The weight update equation for the weights between the input and hidden layers can be obtained by calculating the gradient of the cost function with respect to these weights. For \(w=w_{lm}\), following the chain rule of differentiation:
Case I. In a three-layer network, the difference lies at the hidden layer, where either of the proposed neurons may be used. When the SRS neuron is in the hidden layer, \(V_m=V_m^\sigma\) and Eq. 28 can be rewritten as:
$$ \begin{aligned} {\frac{-\partial E}{\partial \Re(w_{lm}^S)}} =&\;{\frac{1}{N}}\left\{{\frac{\partial \left(\Re\left(V_m^\sigma \right) \right)}{\partial \Re\left(w_{lm}^S \right)}}\Re\left(\Gamma_{m}^{\sigma} \right)+{\frac{\partial \left(\Im\left(V_m^\sigma \right) \right)}{\partial \Re\left(w_{lm}^S \right)}}\Im\left(\Gamma_{m}^{\sigma} \right) \right\}\\ =&\, {\frac{1}{N}}\left\{(\Re(H_m) \Re(z_l)-\Im(H_m) \Im(z_l)) \Re\left(\Gamma_{m}^{\sigma}\right)+(\Re(H_m) \Im(z_l)+\Im(H_m) \Re(z_l)) \Im\left(\Gamma_{m}^{\sigma} \right) \right\} \end{aligned} $$(29)

Similarly
$$ {\frac{-\partial E}{\partial \Im\left(w_{lm}^S\right)}}={\frac{1}{N}}\left\{ (\Re(H_m) (-\Im(z_l))-\Im(H_m)\Re(z_l)) \Re\left(\Gamma_{m}^{\sigma}\right)+(\Re(H_m) \Re(z_l)-\Im(H_m) \Im(z_l)) \Im\left(\Gamma_{m}^{\sigma}\right) \right\} $$(30)

Now substituting Eqs. 29 and 30 into Eq. 24:
$$ \Updelta w^{S}_{lm}={\frac{\eta}{N}}\left\{ (\Re(z_l)-j \Im(z_l)) (\Re(H_m)-j \Im(H_m)) (\Re(\Gamma_{m}^{\sigma}) +j \Im(\Gamma_{m}^{\sigma}))\right\}={\frac{\eta}{N}} \overline{z_l} \overline{H}_{m} \Gamma_{m}^{\sigma} $$(31)

Following the same procedure, update equations for the other learning parameters can be obtained:
$$ \Updelta w^{{\rm RB}}_{lm}={\frac{2\eta} {N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \left(z_l - w^{{\rm RB}}_{lm}\right) \left\{ \Re\left(\Gamma_{m}^{\sigma}\right) \Re(K_m) + \Im\left(\Gamma_{m}^{\sigma}\right)\Im(K_m) \right\} $$(32)

$$ \Updelta H_m={\frac{\eta}{N}} \overline{\left(W^{S}_{m} Z^{\rm T}\right)} \Gamma_{m}^{\sigma} \quad \Updelta K_m={\frac{\eta}{N}} \exp\left(-\left\|Z - W^{{\rm RB}}_m\right\|^2\right) \Gamma_{m}^{\sigma} \quad \Updelta w_{0m}={\frac{\eta}{N}} \overline{z_0} \Gamma_{m}^{\sigma} $$(33)
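The SRS hidden-layer updates of Eqs. 31–33 can be sketched directly in code. Here `gamma` stands for the back-propagated term \(\Gamma_{m}^{\sigma}\), which is assumed to have been computed from the output-layer errors; the function name and argument layout are illustrative.

```python
import numpy as np

def srs_hidden_updates(z, wS, wRB, H, K, gamma, eta, N, z0=1.0):
    """Update increments for one SRS hidden neuron (Eqs. 31-33).

    z, wS, wRB : complex vectors (inputs, summation and RBF weights)
    H, K       : complex compensatory weights
    gamma      : back-propagated error term Gamma_m^sigma (precomputed)
    """
    rbf = np.exp(-np.linalg.norm(z - wRB) ** 2)            # exp(-||Z - W_RB||^2)
    d_wS = (eta / N) * np.conj(z) * np.conj(H) * gamma     # Eq. 31
    d_wRB = (2 * eta / N) * rbf * (z - wRB) * (
        gamma.real * K.real + gamma.imag * K.imag)         # Eq. 32
    d_H = (eta / N) * np.conj(np.dot(wS, z)) * gamma       # Eq. 33
    d_K = (eta / N) * rbf * gamma
    d_w0 = (eta / N) * np.conj(z0) * gamma
    return d_wS, d_wRB, d_H, d_K, d_w0
```

Note that only the radial basis weight update (Eq. 32) mixes real and imaginary parts through a real coefficient; the remaining updates keep the compact conjugate form of Eq. 31.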
Case II. When the SRP neuron is in the hidden layer, \(V_m=V_m^\pi\) and Eq. 28 can be rewritten as:
$$ \begin{aligned} {\frac{-\partial E}{\partial \Re\left(w_{lm}^S\right)}}=&\;{\frac{1}{N}}\left\{ {\frac{\partial\left(\Re\left(V_{m}^{\pi}\right)\right)} {\partial\Re\left(w_{lm}^S\right)}}\Re\left(\Gamma_{m}^{\pi}\right)+{\frac{\partial\left(\Im\left(V_m^\pi\right)\right)}{\partial\Re\left(w_{lm}^S\right)}} \Im\left(\Gamma_{m}^{\pi}\right) \right\}\\=&\; {\frac{1}{N}} \Re\left(\Gamma_{m}^{\pi}\right)\left\{\left(\Re(H_m)+\Re(H_m)\Re\left(V^{\pi_{2}}_m\right)-\Im(H_m)\Im\left(V^{\pi_{2}}_m\right)\right)\Re(z_l) -\left(\Im(H_m)+\Im(H_m)\Re\left(V^{\pi_{2}}_m\right)+\Re(H_m)\Im\left(V^{\pi_{2}}_m\right)\right)\Im(z_l)\right\}\\&\quad+\; \Im\left(\Gamma_{m}^{\pi}\right)\left\{\left(\Im(H_m)+\Im(H_m)\Re\left(V^{\pi_{2}}_m\right)+\Re(H_m)\Im(V^{\pi_{2}}_m)\right)\Re(z_l) +\left(\Re(H_m)+\Re(H_m)\Re\left(V^{\pi_{2}}_m\right)-\Im(H_m)\Im(V^{\pi_{2}}_m)\right)\Im(z_l)\right\}\end{aligned} $$(34)

Similarly
$$ \begin{aligned} {\frac{-\partial E}{\partial \Im(w_{lm}^S)}}=&\;{\frac{1}{N}} \left\{{\frac{\partial (\Re(V_m^\pi))}{\partial \Im(w_{lm}^S)}} \Re(\Gamma_{m}^{\pi})+{\frac{\partial (\Im(V_m^\pi))}{\partial \Im(w_{lm}^S)}} \Im(\Gamma_{m}^{\pi}) \right\}\\ =&\;{\frac{1}{N}} \Re(\Gamma_{m}^{\pi}) \left\{-(\Re(H_m)+\Re(H_m)\Re(V^{\pi_{2}}_m) -\Im(H_m)\Im(V^{\pi_{2}}_m))\Im(z_l) - (\Im(H_m)+\Im(H_m)\Re(V^{\pi_{2}}_m) +\Re(H_m)\Im(V^{\pi_{2}}_m))\Re(z_l)\right\}\\ &+\; \Im(\Gamma_{m}^{\pi})\left\{(\Re(H_m)+\Re(H_m)\Re(V^{\pi_{2}}_m) -\Im(H_m)\Im(V^{\pi_{2}}_m))\Re(z_l)-(\Im(H_m)+\Im(H_m)\Re(V^{\pi_{2}}_m) +\Re(H_m)\Im(V^{\pi_{2}}_m))\Im(z_l)\right\} \end{aligned} $$(35)
Now substituting Eqs. 34 and 35 in Eq. 24
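By analogy with Eq. 31, the substitution collapses into a compact conjugate form; a sketch of the resulting expression, assuming the grouping \(H_m(1+V^{\pi_{2}}_m)\) implied by the bracketed terms in Eqs. 34 and 35:

$$ \Updelta w^{S}_{lm}={\frac{\eta}{N}}\, \overline{z_l}\; \overline{H_m\left(1+V^{\pi_{2}}_{m}\right)}\; \Gamma_{m}^{\pi} $$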
Following the same procedure, update equations for the other learning parameters can be obtained:
where
and
This completes the derivation.
Tripathi, B.K., Kalra, P.K. The novel aggregation function-based neuron models in complex domain. Soft Comput 14, 1069–1081 (2010). https://doi.org/10.1007/s00500-009-0502-5