Elsevier

Applied Soft Computing

Volume 9, Issue 1, January 2009, Pages 20-29
Multilayer perceptron neural networks with novel unsupervised training method for numerical solution of the partial differential equations

https://doi.org/10.1016/j.asoc.2008.02.003

Abstract

In this paper, a novel method for solving both kinds of differential equation, ordinary and partial, is presented using MultiLayer Perceptron (MLP) and Radial Basis Function (RBF) neural networks. From the differential equation and its boundary conditions, an energy function of the network is constructed and used in the unsupervised training method to update the network parameters. The method was applied to the nonlinear Schrodinger equation for the hydrogen atom and a triangle-shaped quantum well. Comparison of the results with the analytical solution and with two well-known numerical methods, Runge–Kutta and finite element, shows that neural networks solve the differential equations with high accuracy, fast convergence and low memory use.

Introduction

Scientists and engineers use several techniques to solve continuum or field problems. Loosely speaking, these techniques can be classified as experimental, analytical, or numerical. Experiments are expensive, time consuming, sometimes hazardous, and usually do not allow much flexibility in parameter variation. Analytical and numerical methods are the alternatives, although every numerical method involves an analytic simplification up to the point where the numerical method is easy to apply. Notwithstanding this fact, the following methods are among the most commonly used for solving differential equations (DEs).

  • A. Analytical methods (exact solutions)
    • 1. Separation of variables
    • 2. Series expansion
    • 3. Conformal mapping
    • 4. Integral solutions, e.g., Laplace and Fourier transforms
    • 5. Perturbation methods

  • B. Numerical methods (approximate solutions)
    • 1. Finite difference method
    • 2. Method of weighted residuals
    • 3. Moment method
    • 4. Finite element method
    • 5. Runge–Kutta method
    • 6. Transmission-line modeling
    • 7. Monte Carlo method
    • 8. Method of lines

These methods are applied in several fields, such as electromagnetics (EM), fluid flow, heat transfer, and acoustics [1].

This paper applies Artificial Neural Networks (ANNs) to the solution of both kinds of DE, Partial Differential Equations (PDEs) and Ordinary Differential Equations (ODEs). The MultiLayer Perceptron (MLP) and Radial Basis Function (RBF) networks are powerful ANN architectures for interpolation in multidimensional space, and both are universal function approximators [5].

Solving DEs with a trained ANN offers the following advantages over standard numerical methods [2]:

  • 1. The solution search proceeds without coordinate transformations.
  • 2. The ANN learns a solution of the DE in closed analytic form.
  • 3. Computational complexity does not increase quickly as the number of sampling points increases.
  • 4. Solution values are calculated rapidly.

In contrast, standard methods provide solution values only at discrete, pre-defined sampling locations of the solution space, and their computational complexity increases quickly with the number of sampling points [3], [4]. Round-off errors also seriously degrade the solution accuracy of numerical algorithms whose complexity grows rapidly with the number of sampling points [4].

An MLP or RBF network is built from three layers that perform entirely different calculations [5], [6]. The input layer is made of source nodes and receives the input information. The single hidden layer performs a nonlinear transformation from the input space to a high-dimensional hidden space. The linear output layer delivers the network response to the activation pattern. Within this framework, an unsupervised training method for solving the DE is considered: since the exact form of the DE solution is unknown, the network is trained in an unsupervised manner using an energy (error) function that is derived from the DE itself and the applicable boundary conditions.
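The idea of an energy function built from the DE and its boundary conditions can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the test problem dy/dx = -y with y(0) = 1 on [0, 1], the sigmoid hidden units, the names `network`, `energy`, `energy_gradient` and `train`, and the update rule (finite-difference gradient descent with step halving), which stands in for the paper's own training rule, not shown in this excerpt.

```python
import math
import random

def sigmoid(z):
    # Logistic sigmoid written with tanh so it cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def network(params, x):
    """Return N(x) and dN/dx for a one-hidden-layer MLP with linear output.

    params is a list of [w, b, v] triples, one per hidden neuron.
    """
    n, dn = 0.0, 0.0
    for w, b, v in params:
        s = sigmoid(w * x + b)
        n += v * s
        dn += v * w * s * (1.0 - s)   # chain rule through the sigmoid
    return n, dn

def energy(params, xs):
    """Squared DE residual at the sample points plus the boundary term."""
    total = 0.0
    for x in xs:
        n, dn = network(params, x)
        total += (dn + n) ** 2        # residual of dy/dx + y = 0 (assumed DE)
    n0, _ = network(params, 0.0)
    return total + (n0 - 1.0) ** 2    # boundary condition y(0) = 1

def energy_gradient(params, xs, eps=1e-6):
    """Central finite-difference gradient of the energy (illustrative only)."""
    grad = []
    for j in range(len(params)):
        g_unit = []
        for k in range(3):
            plus = [list(u) for u in params]
            minus = [list(u) for u in params]
            plus[j][k] += eps
            minus[j][k] -= eps
            g_unit.append((energy(plus, xs) - energy(minus, xs)) / (2 * eps))
        grad.append(g_unit)
    return grad

def train(params, xs, steps=200, rate=0.1):
    """Gradient descent that halves the step until the energy decreases."""
    current = energy(params, xs)
    for _ in range(steps):
        grad = energy_gradient(params, xs)
        step = rate
        while step > 1e-10:
            trial = [[p - step * g for p, g in zip(unit, g_unit)]
                     for unit, g_unit in zip(params, grad)]
            trial_energy = energy(trial, xs)
            if trial_energy < current:
                params, current = trial, trial_energy
                break
            step *= 0.5
        else:
            break                     # no downhill step found; stop early
    return params, current

random.seed(0)
xs = [i / 10 for i in range(11)]      # sample points in [0, 1]
initial = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
e_start = energy(initial, xs)
trained, e_end = train(initial, xs)
```

No target values of the solution appear anywhere in the training loop; only the DE residual and the boundary condition drive the update. After training, `network(trained, x)[0]` can be evaluated at any x in [0, 1], not only at the sample points, and compared with exp(-x).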

This method was tested by solving the Nonlinear Schrodinger Equation (NLSE) for the hydrogen atom and a triangle-shaped quantum well. The equation of motion of a quantum particle in a three-dimensional (3D) system is the following time-independent Schrodinger equation [7], [8]:

$$\hat{H}\psi(\mathbf{r}) \equiv \left[-\frac{\hbar^{2}}{2m}\nabla^{2} + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r}) \tag{1}$$

where ∇, m and V(r) are the del operator, the particle mass and the potential function, respectively, and ħ is the reduced Planck constant. Also, Ĥ, ψ(r), E and r denote the system Hamiltonian, eigenfunction, eigenvalue and vector of coordinate variables, respectively.
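As a small worked check of the time-independent Schrodinger equation, the sketch below evaluates (Ĥψ)/ψ numerically for the hydrogen 1s state. It assumes atomic units (ħ = m = 1, V(r) = -1/r), in which the known ground state is ψ(r) = exp(-r) with E = -1/2; the units, the finite-difference step and the function names `psi` and `local_energy` are choices made for this illustration, not the paper's.

```python
import math

def psi(r):
    """Hydrogen 1s eigenfunction in atomic units (unnormalised)."""
    return math.exp(-r)

def local_energy(r, h=1e-3):
    """Evaluate (H psi)/psi at radius r using central differences.

    For a spherically symmetric psi the Laplacian reduces to
    psi'' + (2/r) psi'.
    """
    d1 = (psi(r + h) - psi(r - h)) / (2 * h)
    d2 = (psi(r + h) - 2 * psi(r) + psi(r - h)) / h ** 2
    laplacian = d2 + 2.0 * d1 / r
    potential = -1.0 / r              # Coulomb potential, atomic units
    return (-0.5 * laplacian + potential * psi(r)) / psi(r)

# For an exact eigenfunction the ratio is the same constant E at every radius.
energies = [local_energy(r) for r in (0.5, 1.0, 2.0, 4.0)]
```

Each entry of `energies` should agree with -0.5 up to the finite-difference error, which is the sense in which ψ is an eigenfunction and E its eigenvalue.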

Alexopoulos [20] derived the scattering matrix for particles that encounter a quantum potential by discretising Schrodinger's time-independent differential equation, without having to manipulate the eigenfunctions directly. This is achieved by expanding the eigenfunctions in Taylor series so that a discrete form can be obtained. The final solutions are given in terms of discrete scattering matrices, from which transmission and reflection probabilities are derived via the total scattering matrix, that is, the product of the (N + 1)-order discrete scattering matrices.

Monovasilis and Simos [21] present explicit methods for the numerical solution of the Schrodinger equation. The Schrodinger equation is first transformed into a Hamiltonian canonical equation, for which several methods up to the eighth order are then developed. They present second, third, fourth, fifth, sixth and eighth order symplectic methods with several stages.

Lehtovaara et al. [22] computed the eigenvalues and eigenvectors of large matrices originating from the discretization of linear and nonlinear Schrodinger equations using the imaginary time propagation (ITP) method. Convergence properties and accuracy of second and fourth order operator-splitting methods for the ITP method are studied using numerical examples. The natural convergence of the method is further accelerated with a new dynamic time step adjustment method.

Also, Mazzone and Morandi [23] studied the solution of the multi-particle, time-dependent Schrodinger equation using quantum Monte Carlo methods and numerical integration. The Monte Carlo method is based on a mixed scheme, combining classical dynamics for the nuclei and quantum mechanics for the electrons. The numerical solution uses a discretization of the Schrodinger equation in real space and time. Many other studies address the numerical solution of initial and initial-boundary value problems for the linear or nonlinear Schrodinger equation; see, e.g., [19], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36].

Most of these conventional methods suffer from low convergence speed, complex computation and high memory use. An ANN is a parallel distributed processor that consists of a number of simply designed computing units, called neurons. An ANN uses little memory and performs only simple computations in each neuron, which leads to fast convergence when solving such problems. The massive interconnection between its neurons can be used for storing various types of information, especially information classified as knowledge or experience [5], [9]. This information is stored in the inter-neuron connection strengths, or synaptic weights.

Solving the NLSE yields the eigenvalues (E) and eigenfunctions (ψ). Some cases have analytical solutions [7], which allow us to check our numerical results. Our numerical results agree with the corresponding analytical solutions and show the efficiency of the ANN method for solving differential equations.

The outline of this paper is as follows. Section 2 derives the details of the new method and explains how it can be applied to differential equations. Section 3 presents our numerical results for the solution of the NLSE. Finally, the last section is the conclusion.

MLP and RBF networks

Feed-forward neural networks are the most popular architectures owing to their structural flexibility, good representational capabilities and the availability of a large number of training algorithms [5]. Such a network consists of neurons arranged in layers, in which every neuron is connected to all neurons of the next layer (a fully connected network).
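A fully connected three-layer forward pass can be sketched as follows for both network types. The transfer functions used here, a logistic sigmoid for the MLP hidden units and a Gaussian for the RBF hidden units, are common textbook choices assumed for illustration; the paper's own definitions are not shown in this excerpt, and the names `mlp_output` and `rbf_output` are hypothetical.

```python
import math

def mlp_output(x, hidden, output_weights):
    """Three-layer MLP: inputs -> sigmoid hidden units -> linear output.

    hidden is a list of (weights, bias) pairs, one per hidden neuron;
    every input feeds every hidden neuron (fully connected).
    """
    activations = []
    for weights, bias in hidden:
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        activations.append(0.5 * (1.0 + math.tanh(0.5 * z)))  # logistic sigmoid
    return sum(v * a for v, a in zip(output_weights, activations))

def rbf_output(x, centres, widths, output_weights):
    """Three-layer RBF network: Gaussian hidden units -> linear output."""
    activations = []
    for centre, width in zip(centres, widths):
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
        activations.append(math.exp(-dist2 / (2.0 * width ** 2)))
    return sum(v * a for v, a in zip(output_weights, activations))

x = [0.0, 0.0]
mlp_value = mlp_output(
    x,
    hidden=[([1.0, -1.0], 0.0), ([0.5, 0.5], 0.0)],
    output_weights=[2.0, 4.0],
)
rbf_value = rbf_output(
    x,
    centres=[[0.0, 0.0], [1.0, 0.0]],
    widths=[1.0, 1.0],
    output_weights=[1.0, 1.0],
)
```

The two functions differ only in how a hidden unit responds to its input: the MLP unit depends on a weighted sum of the inputs, while the RBF unit depends on the distance between the input and the unit's centre.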

MLP and RBF networks are two kinds of feed-forward neural network with different transfer functions. The output of a three-layer MLP network is

Numerical results

In this section, we present the numerical example of a nonlinear Schrodinger equation, Eq. (1). In order to determine the achieved accuracy, two types of errors were measured. First, the Max error [12], [13]:

$$e_{\infty} = \mathrm{Max}\left(\left|\psi(r_i) - \tilde{\psi}(r_i)\right|\right), \qquad 0 \le i \le K$$

where $\psi(r_i)$ is the exact solution and $\tilde{\psi}(r_i)$ is the approximate solution. Second, the RMS error given by [14], [15]

$$e_{2} = \sqrt{\frac{1}{K}\sum_{0 \le i \le K}\left|\psi(r_i) - \tilde{\psi}(r_i)\right|^{2}}.$$
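The two error measures can be computed as in the sketch below. The function names and the sample data are illustrative assumptions: the "approximation" is the exact values of exp(-r) shifted by a made-up alternating offset of 0.01, not a result from the paper, and the RMS value is normalised by the number of sample points.

```python
import math

def max_error(exact, approx):
    """Largest absolute deviation over the sample points."""
    return max(abs(e - a) for e, a in zip(exact, approx))

def rms_error(exact, approx):
    """Root-mean-square deviation over the sample points."""
    squared = [(e - a) ** 2 for e, a in zip(exact, approx)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative data only: exact values and an artificially perturbed copy.
r_values = [0.0, 0.5, 1.0, 1.5, 2.0]
exact = [math.exp(-r) for r in r_values]
approx = [value + 0.01 * (-1) ** i for i, value in enumerate(exact)]

e_max = max_error(exact, approx)
e_rms = rms_error(exact, approx)
```

The RMS error never exceeds the Max error, so reporting both shows whether the deviation is spread evenly over the domain or concentrated at a few points.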

Conclusion

A novel method for solving ordinary and partial differential equations, based on Artificial Neural Networks (ANN), is presented. This method relies on the function approximation capabilities of feed-forward neural networks [5] and provides accurate and differentiable solutions in a closed analytic form. The method was applied to the nonlinear Schrodinger equation [7] for the hydrogen atom and a triangle-shaped quantum well [8], [10]. By comparing the proposed method results with analytical solutions

References (36)

  • M. Delfour et al., Finite-difference solutions of a non-linear Schrodinger equation, J. Comput. Phys. (1981)
  • B.M. Herbst et al., Numerical experience with the nonlinear Schrodinger equation, J. Comput. Phys. (1985)
  • P.L. Nash et al., Efficient difference solutions to the time-dependent Schrodinger equation, J. Comput. Phys. (1997)
  • J.M. Sanz-Serna et al., A method for the integration in time of certain partial differential equations, J. Comput. Phys. (1983)
  • T.R. Taha et al., Analytic and numerical aspects of certain nonlinear evolution equations. II. Numerical, nonlinear Schrodinger equation, J. Comput. Phys. (1984)
  • K.H. Huebner et al., The Finite Element Method for Engineers (1982)
  • C. Monterola et al., Characterizing the dynamics of constrained physical systems with unsupervised neural network, Phys. Rev. E (1998)
  • W. Press et al., Numerical Recipes: The Art of Scientific Computing (1986)