Neural Networks

Volume 16, Issue 10, December 2003, Pages 1461-1481

A functions localized neural network with branch gates

https://doi.org/10.1016/S0893-6080(03)00211-9

Abstract

In this paper, a functions localized network with branch gates (FLN-bg) is studied, which consists of a basic network and a branch gate network. The branch gate network determines which intermediate nodes of the basic network should be connected to the output node, using a gate coefficient ranging from 0 to 1. This adjusts the outputs of the intermediate nodes of the basic network depending on the input values of the network, in order to realize a functions localized network. FLN-bg is applied to function approximation problems and a two-spiral problem. The simulation results show that FLN-bg exhibits better performance than conventional neural networks with comparable complexity.

Introduction

In the brain and cognitive sciences, many biologists and psychologists share the view that the brain has many specialized parts, each of which performs a specific function. Depending on the information received by the brain, different parts are activated; furthermore, even within an activated part, different regions are activated with different intensities. This means that functions are localized in the brain. This assumption is sometimes used as a foundation for describing the brain's underlying structure (Gazzaniga, 1989; Sawaguchi, 1989).

But commonly used artificial neural networks, especially layered neural networks, do not have such a structure: all input and output nodes are connected to the intermediate nodes, and the connections between nodes never change depending on the input values. Therefore, functions localization is hard to realize in conventional neural networks. Here, functions localization means that not all, but only parts, of the network are activated depending on the input values of the network.

To obtain such a brain-like model in artificial neural networks for solving large-scale, difficult real-world problems, the key questions are how to divide a problem into smaller and simpler subproblems, and how to guide the training algorithm to realize functions localization without degrading the system's performance.

In this paper, an artificial functions localized network with branch gates (FLN-bg) is proposed using universal learning networks (ULNs), and it is studied how much FLN-bg can improve performance compared with conventional neural networks.

FLN-bg is composed of two kinds of networks. One is a basic network, which is a conventional layered neural network. The other is a branch gate network, which controls the branches from the intermediate nodes to the output node of the basic network: the outputs of the intermediate nodes are multiplied by gate coefficients ranging from 0 to 1, which are calculated by the branch gate network.

Therefore, FLN-bg can realize artificial functions localization: the basic network is divided into smaller and simpler subnetworks by cutting some of its branches when the gate coefficients calculated by the branch gate network become zero. This connection or disconnection of the branches of the basic network depends on the input values of the network. On the other hand, when the gate coefficients do not take a value of zero, the multiplication at the outputs of the intermediate nodes is carried out and plays an important role in improving the performance of FLN-bg, as in the sketch below.
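To make the gating concrete, the following is a minimal Python sketch of such a forward pass. It assumes sigmoid node functions, a single output node, and a caller-supplied gate network; the names (fln_bg_forward, gate_net, W_hid, W_out) are illustrative and not taken from the paper.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def fln_bg_forward(x, W_hid, W_out, gate_net):
        # Outputs of the intermediate (hidden) nodes of the basic network.
        h = sigmoid(W_hid @ x)
        # One gate coefficient in [0, 1] per intermediate-to-output branch,
        # computed by the branch gate network from the same input vector.
        g = gate_net(x)
        # Each intermediate output is multiplied by its gate coefficient;
        # g[j] == 0 cuts branch j, so only part of the network is active.
        return sigmoid(W_out @ (g * h))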

A functions localized network, the learning petri network (LPN) (Hirasawa, Ohbayashi, Sakai, & Hu, 1998), has already been proposed in our laboratory; it combines Petri networks with neural networks. In LPN, functions localization is realized by the token control of Petri networks, and learning is executed by the back-propagation method of neural networks. However, token control is fairly difficult to realize because of its complicated mechanism.

Jacobs' and Jordan's modular networks (Jacobs & Jordan, 1993; Jordan & Jacobs, 1994) have module architectures similar to that of FLN-bg. However, the structure of the modular networks differs from that of FLN-bg. While the modules in modular networks are completely separated, FLN-bg has an integral construction containing many modules, where different modules have their own nodes and branches that may also belong to other modules; in other words, two different modules in FLN-bg can overlap. In addition, like modular networks, FLN-bg performs multiplication operations when the gate coefficients are non-zero. By using such multiplication operations, more sophisticated functions localization can be realized in the basic network, improving the performance of FLN-bg.

The remainder of this paper is organized as follows. In Section 2, ULNs are reviewed briefly, because ULNs are used to construct FLN-bg, which uses various kinds of node functions including multiplication. In Section 3, we present the basic structure and training algorithm of FLN-bg. The reason why FLN-bg performs better than commonly used neural networks is discussed in Section 4. In Section 5, we give simulation conditions and results for both function approximation problems and a two-spiral problem. Finally, conclusions are given in Section 6.

Section snippets

Universal learning networks

In this section, the structure of ULNs is explained briefly, because ULNs are a superset of learning networks whose distinguishing characteristics are used to realize FLN-bg.

Since the first proposal of a neuron model (McCulloch & Pitts, 1943), and especially after the revitalization of artificial neural networks in the 1980s (Hopfield, 1982; Rumelhart et al., 1986), a variety of neural networks have been devised and are now applied in many fields. The vast majority of neural networks in use are those with a layered structure.

Basic structure of FLN-bg

FLN-bg is composed of a basic network and a branch gate network (Fig. 2). The outputs of the branch gate network are in one-to-one correspondence with the branches from the intermediate nodes to the output node of the basic network. As a result, the outputs of the branch gate network can control the connections of the corresponding intermediate branches of the basic network.

In this paper, a commonly used three-layered neural network is used as the basic network, while a fuzzy inference network is used as the branch gate network; a stand-in sketch follows below.
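A stand-in for the branch gate network, reusing the definitions from the earlier sketch, might look as follows. The single sigmoid layer is an assumption made here only to keep each coefficient in [0, 1]; it is not the fuzzy inference network the paper actually employs, whose details are not given in this snippet.

    def make_gate_net(W_gate, b_gate):
        # One output per intermediate-to-output branch of the basic
        # network, in one-to-one correspondence with those branches.
        # A sigmoid keeps every gate coefficient in [0, 1]; this plain
        # layer substitutes for the paper's fuzzy inference network.
        def gate_net(x):
            return sigmoid(W_gate @ x + b_gate)
        return gate_net

    # Usage with the earlier forward pass:
    # gate_net = make_gate_net(W_gate, b_gate)
    # y = fln_bg_forward(x, W_hid, W_out, gate_net)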

Advantages of FLN-bg

In this section, we discuss the advantages of FLN-bg over commonly used neural networks.

Simulations of FLN-bg

Simulations are carried out by adjusting the threshold $Z_{jo}$ in order to study the fundamental characteristics of FLN-bg, that is, to study (1) whether there exists an optimal connectivity as shown in Fig. 7, even when the parameters $\mu_{jq} \in [0,1]$ of the fuzzy inference network are set randomly, (2) whether FLN-bg can train the parameters of the network faster than conventional neural networks, and (3) whether multiplication nodes are useful for realizing functions localization and for improving the performance of FLN-bg. A thresholding sketch follows below.
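One plausible reading of the threshold's role, sketched below purely as an assumption since the snippet does not spell out how $Z_{jo}$ enters the computation, is that gate coefficients below the threshold are forced to zero, disconnecting the corresponding branches and thereby controlling the network's connectivity:

    def threshold_gates(g, Z):
        # Assumption: any gate coefficient below the threshold Z is set
        # to zero, cutting its intermediate-to-output branch so that only
        # a subnetwork of the basic network is active for a given input.
        return np.where(g >= Z, g, 0.0)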

Conclusions

In this paper, a new type of functions localized network named FLN-bg has been proposed. It consists of two kinds of networks: one is a basic network and the other is a branch gate network. The branch gate network can control the connection and disconnection of the branches of the basic network with gate coefficients that depend on the input patterns in the input space. From our simulation results, it has been clarified that FLN-bg shows better performance and generalization ability, and faster learning speed, than conventional neural networks with comparable complexity.

References (29)

  • K. Hirasawa et al.

    Improvement of generalization ability for identifying dynamical systems by using universal learning networks

    Neural Networks

    (2001)
  • K. Hirasawa et al.

    Universal learning network and its application to chaos control

    Neural Networks

    (2000)
  • K. Hornik et al.

    Multilayer feedforward networks are universal approximators

    Neural Networks

    (1989)
  • S. Fahlman et al.

    The cascade-correlation learning architecture

    Advances in Neural Information Processing Systems 2 (San Mateo)

    (1990)
  • M. Gazzaniga

    Organization of the human brain

    Science

    (1989)
  • K. Hirasawa et al.

    Universal learning network and its application to robust control

    IEEE Transactions on Systems, Man, and Cybernetics, Part B

    (2000)
  • K. Hirasawa et al.

    Learning petri network and its applications to non-linear system control

    IEEE Transactions on Systems, Man, and Cybernetics, Part B

    (1998)
  • J. Hopfield

    Neural networks and physical systems with emergent collective computational abilities

    Proceedings of the National Academy of Sciences, USA

    (1982)
  • J. Hu et al.

    RasID—Random search for neural networks training

    Journal of Advanced Computational Intelligence

    (1998)
  • R. Jacobs et al.

    Learning piecewise control strategies in a modular neural network architecture

    IEEE Transactions on Systems, Man, and Cybernetics

    (1993)
  • J.-S. Jang et al.

    Neuro-fuzzy modeling and control

    Proceedings of the IEEE

    (1995)
  • M. Jordan et al.

    Hierarchical mixtures of experts and the EM algorithm

    Neural Computation

    (1994)
  • S. Kirkpatrick et al.

    Optimization by simulated annealing

    Science

    (1983)
  • K. Lang et al.

    Learning to tell two spirals apart

    Proceedings of the 1988 Connectionist Models Summer School

    (1988)
