Contrastive divergence for memristor-based restricted Boltzmann machine

https://doi.org/10.1016/j.engappai.2014.09.013

Abstract

Restricted Boltzmann machines and deep belief networks have been shown to perform effectively in many applications, such as supervised and unsupervised learning, dimensionality reduction and feature learning. Implementing networks that use contrastive divergence as the learning algorithm on neuromorphic hardware can be beneficial for real-time hardware interfacing, power-efficient hardware and scalability. Neuromorphic hardware that uses memristors as synapses is one of the most promising routes towards these goals. This paper presents a restricted Boltzmann machine which uses a two-memristor model to emulate synaptic weights and achieves learning using contrastive divergence.

Introduction

Memristors have opened a new direction for the advancement of neuromorphic and analog applications. Biologically inspired computation is appealing, and many such methods have recently been researched extensively. Artificial neural networks, being one of them, have received growing attention owing to deep learning and its hardware implementation on neuromorphic devices. The pioneering work of Snider (2007) presented neural networks using memristors as synapses, emphasizing easily manufactured, large-scale devices. An associative memory using memristors was first demonstrated in Pershin and Di Ventra (2010), while the use of memcapacitors as synapses with integrate-and-fire neurons is presented in Pershin and Di Ventra (2014). In many later works, memristors have been employed to implement synapses in neural networks (Thomas, 2013, Indiveri et al., 2013). Spike-timing-dependent plasticity (STDP) using memristors as synapses, which can be used to emulate the auditory system, is discussed in Serrano-Gotarredona et al. (2013). Memristors have also been used in image processing to detect edges (Prodromakis and Toumazou, 2010).

Restricted Boltzmann machines (RBMs) used in deep networks have shown promising results in general, with the best results achieved on image classification problems (Larochelle and Bengio, 2008). RBMs in deep networks are trained in an unsupervised fashion using contrastive divergence (CD) as the learning algorithm. Since training large-scale RBMs is time consuming, some fast hardware implementations have been suggested in the literature; e.g., FPGA-based RBM implementations have been suggested in Ly and Chow (2010), Kim et al. (2010) and, recently, Kim et al. (2014). A VLSI implementation of an RBM with continuous-valued neurons, called the continuous RBM, is suggested in Chen et al. (2006). An implementation of an RBM using digital synapses has been presented in Merolla et al. (2011), but no such implementation of RBMs exists for memristor-based synapses.

This paper presents an approach to implementing CD in one layer of an RBM which uses memristors as weight edges (synapses). The conductance of a memristor is used to emulate a positive real-valued storage memory, which can be read and written using digital pulses. Furthermore, given that the conductance of an electronic device cannot be negative, a mechanism needs to be developed so that memristors can store negative real numbers as well. Additionally, since memristors are very noisy devices, it is important to use a realistic memristor model with noise to better understand how learning behavior is affected under such conditions. This work presents such a mechanism, which makes only minor adjustments to RBMs and CD to keep the architecture simple, resulting in an extensible system.
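As a minimal sketch of the differential idea outlined above (not the paper's circuit; the class name, conductance bounds and Gaussian write-noise model are assumptions made for illustration), a signed weight can be stored as the difference of two bounded, noisy conductances:

```python
import numpy as np

class DifferentialMemristorWeight:
    """Sketch: a signed weight stored as w = g_pos - g_neg.

    Conductances are clipped to [g_min, g_max] and every programming
    pulse is perturbed by Gaussian noise (an assumed noise model,
    not the device characterization used in the paper).
    """

    def __init__(self, g_min=0.0, g_max=1.0, noise_std=0.01, rng=None):
        self.g_min, self.g_max = g_min, g_max
        self.noise_std = noise_std
        self.rng = rng or np.random.default_rng()
        mid = 0.5 * (g_min + g_max)
        self.g_pos = mid
        self.g_neg = mid  # equal conductances -> weight 0

    @property
    def value(self):
        return self.g_pos - self.g_neg

    def apply_update(self, delta_w):
        """Positive updates pulse the 'positive' device; negative
        updates pulse the 'negative' one."""
        noise = self.rng.normal(0.0, self.noise_std)
        if delta_w >= 0:
            self.g_pos = np.clip(self.g_pos + delta_w + noise,
                                 self.g_min, self.g_max)
        else:
            self.g_neg = np.clip(self.g_neg - delta_w + noise,
                                 self.g_min, self.g_max)
```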

This paper is organized as follows. Section 2 introduces the background of the RBM, followed by a description of CD and of resistive random-access memory (RRAM) memristors. Section 3 describes in detail the proposed RBM architecture and CD for the memristor-based RBM. Section 4 presents the simulation setup and the obtained results. Finally, Section 5 concludes the paper.

Section snippets

Background

Boltzmann machines (BMs) are bidirectionally connected networks of stochastic processing units, which can be interpreted and trained as neural networks. In general, BMs are difficult and time consuming to train. Imposing certain restrictions on the network structure of a BM yields an RBM, which is easier and less time consuming to train. From the perspective of neural networks, an RBM is a generative stochastic neural network which maximizes the log probability of its training set. Its structure
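For reference, one step of contrastive divergence (CD-1) for a binary RBM can be sketched as below; the matrix shapes, sampling scheme and learning rate are illustrative assumptions, not the exact configuration used in the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.01, rng=np.random.default_rng()):
    """One CD-1 update for a binary RBM.

    v0 : (batch, n_visible) data batch
    W  : (n_visible, n_hidden) weights
    b  : (n_visible,) visible biases
    c  : (n_hidden,)  hidden biases
    """
    # Positive phase: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one Gibbs step back to visible and up again.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)

    # Gradient estimate: <v h>_data - <v h>_reconstruction.
    dW = (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    db = (v0 - v1).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)

    return W + lr * dW, b + lr * db, c + lr * dc
```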

Memristor-based RBM

Generally, weights in RBMs are real numbers that can in principle range from negative infinity to positive infinity, although in practice these extremes are never reached. One crucial aspect of inducing learning in neural networks is weight initialization: the weights must be randomly initialized, preferably within a specific range. The following subsection presents an implementation of CD on a memristor-based RBM, followed by the details of the weight initialization algorithm.
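As a hedged illustration of such an initialization (the uniform range and the mapping of a signed weight onto a pair of conductances around a common operating point are assumptions for the sketch, not the paper's exact procedure), small random weights can be drawn and split across the two memristors of each synapse:

```python
import numpy as np

def init_conductance_pairs(n_visible, n_hidden, scale=0.01,
                           g_mid=0.5, rng=np.random.default_rng()):
    """Draw small random weights and map each onto a pair of
    conductances around g_mid, so that g_pos - g_neg equals the
    desired signed weight."""
    w = rng.uniform(-scale, scale, size=(n_visible, n_hidden))
    g_pos = g_mid + 0.5 * w
    g_neg = g_mid - 0.5 * w
    return g_pos, g_neg
```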

Simulation and results

For verification of the presented architecture, an RBM was trained and tested on a standard character recognition task using the MNIST dataset, which is composed of 60,000 training images and 10,000 testing images. The memristors used in this work go from the low-resistance state (LRS) to the high-resistance state (HRS) in around 200 pulses, which effectively dictates the learning rate epsilon of (11). Therefore, instead of using the entire MNIST training dataset, only 10,000 training images were chosen randomly, with an equal number of training
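A rough back-of-the-envelope sketch of how the roughly 200-pulse switching range constrains the step size is given below; the conductance bounds are placeholder values rather than measured device data, and equation (11) itself is not reproduced here:

```python
# Rough estimate of the per-pulse weight step implied by a device
# that traverses its conductance window in ~200 pulses.
# g_max and g_min are placeholder values, not measured device data.
g_max, g_min = 1.0, 0.0
n_pulses = 200
epsilon_per_pulse = (g_max - g_min) / n_pulses  # ~0.005 per pulse
print(f"effective learning-rate step per pulse: {epsilon_per_pulse:.4f}")
```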

Conclusions

This work presented a mechanism for implementing an RBM with CD as the learning algorithm for neural networks that use memristors as synaptic or weight edges. The technique presented in this paper was designed to mimic the basic CD cycle performed in software simulations of RBMs, keeping the approach simple and extensible. Although the results achieved in this work do not attain the maximum performance of state-of-the-art implementations, which are designed to use real numbers as

Acknowledgments

This work was partly supported by the ICT R&D program of MSIP/IITP (14-824-09-002, Development of global multi-target tracking and event prediction techniques based on real-time large-scale video analysis), and by the Pioneer Research Center Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning (Grant number 2012-0009462).

References (24)

  • Y.V. Pershin et al., Experimental demonstration of associative memory with memristive neural networks, Neural Netw., 2010.
  • Bengio, Y., 2012. Practical Recommendations for Gradient-based Training of Deep Architectures, CoRR...
  • H. Chen et al., Continuous-valued probabilistic behavior in a VLSI generative model, IEEE Trans. Neural Netw., 2006.
  • Fort, A., Cortigiani, F., Rocchi, S., Vignoli, V., 2003. Very high-speed true random noise generator. Analog Integr....
  • Glorot, X., Bengio, Y., 2010. Understanding the difficulty of training deep feedforward neural networks. In: AISTATS,...
  • G. Hinton, Training products of experts by minimizing contrastive divergence, Neural Comput., 2002.
  • W. Holman et al., An integrated analog/digital random noise source, IEEE Trans. Circuits Syst. I: Fundam. Theory Appl., 1997.
  • Hopfield, J.J., 1982. Neural networks and physical systems with emergent collective computational abilities. Proc....
  • Indiveri, G., Linares-Barranco, B., Legenstein, R., Deligeorgis, G., Prodromakis, T., 2013. Integration of nanoscale...
  • Jo, M., Seong, D., Kim, S., Lee, J., Lee, W., Park, J.B., Park, S., Jung, S., Shin, J., Lee, D., 2010. Novel...
  • Kim, S.K., McMahon, P.L., Olukotun, K., 2010. A large-scale architecture for restricted Boltzmann machines. In: 2010...
  • L.-W. Kim et al., A fully pipelined FPGA architecture of a factored restricted Boltzmann machine artificial neural network, ACM Trans. Reconfig. Technol. Syst., 2014.