
Neurocomputing

Volume 466, 27 November 2021, Pages 128-138

Memory augmented convolutional neural network and its application in bioimages

https://doi.org/10.1016/j.neucom.2021.09.012

Highlights

  • This paper presents a memory augmented convolutional neural network.

  • This paper utilizes an augmenting SOM as memory for better network generalization.

  • This paper instantiates the distributed knowledge representation with SOM and CNN.

Abstract

The long short-term memory (LSTM) network underpins many achievements and breakthroughs, especially in natural language processing. Essentially, it is endowed with certain memory capabilities that boost its performance. Currently, the volume and speed of big data generation are increasing exponentially, and such data require efficient models to acquire memory augmented knowledge. In this paper, we propose a memory augmented convolutional neural network (MACNN) that utilizes a self-organizing map (SOM) as the memory module. First, we depict the potential challenges of applying a convolutional neural network (CNN) alone, so as to highlight the advantage of augmenting it with SOM memory for better network generalization. Then, we describe a corresponding network architecture incorporating memory to instantiate the distributed knowledge representation mechanism, which tactically combines both SOM and CNN. Each component of the input vector is connected with a neuron in a two-dimensional lattice. Finally, we test the proposed network on various datasets, and the experimental results reveal that MACNN can achieve competitive performance, especially on bioimage datasets. Meanwhile, we further illustrate the learned representations to interpret the SOM behavior and to comprehend the achieved results, which indicates that the proposed memory-incorporating model can exhibit better performance.

Introduction

The prosperity of deep learning (DL) triggers a retrospection of research on brain-inspired computation. For example, the tremendous achievements in natural language processing (NLP) in recent years are largely attributed to the construction of large long short-term memory (LSTM) networks. This has encouraged Google and Facebook to build more sophisticated neural networks integrating memory to solve more difficult tasks [1], [2]. However, certain DL applications do not target sequential data, such as image classification, face recognition and cancer prediction, to name a few. These applications are based on variations of convolutional neural network (CNN) architectures [3], [4], [5], which are generally regarded as computation-oriented. Although such a network interacts with all training samples, this process is still regarded as mainly learning optimal filters for the targeted problem rather than memory-based learning. Hence, the impressive performance of CNNs relies on features being effectively extracted from the input data via these optimal filters [6]. In other words, the prediction is mainly based on computation over the current input samples, while their relations with historical samples remain quite vague.

For a group of images containing the same object, e.g., dogs, these images share many features common to dog species. For a given CNN, categorization of the content in a specific image mainly harnesses the knowledge retrieved from the current image by applying the learned filters. In contrast, humans tend to make judgements comprehensively via cognitive processes such as familiarity or even recall if necessary, all of which involve the memory system [7]. These considerations encourage us to devise a memory module to couple with the CNN to improve its performance.

Another motivation for considering a new computation paradigm involving memory originates from a specific application of CNNs, i.e., bioimage classification. Unlike datasets such as ImageNet or Microsoft common objects in context (MS-COCO) for general-purpose image classification [8], [9], bioimage datasets, whether from clinical practice or academic research, tend to be small with imbalanced samples, all of which pose challenges for effectively utilizing CNNs for classification. For example, limited samples increase the risk of overfitting for deep neural networks, and a shallow network may fail to extract suitable features for proper classification. Essentially, this kind of learning from a few samples that generalizes to new data with sufficient accuracy comes naturally to humans [10], and such learning capability is regarded as interleaved with the memory system [11]. Hence, it is promising to propose new network models from a memory augmentation perspective to enhance the performance of CNNs.

Due to the limited understanding of the human memory system, inspirations drawn from it may not be intuitive and can be difficult to manage. Take Google's proposition, the differentiable neural computer (DNC), as an example. It tries to establish an enriched computation model by integrating a large memory, and two extra LSTM networks have to be tailored for the memory mechanism. Although it has been successfully applied to complicated problems, following the architecture of modern computers unavoidably adds complexity to the whole architecture and makes modifications and extensions to the memory module extremely difficult.

In fact, the idea of utilizing memory as part of computation has its own history. Kant [12] proposed the concepts of memory elements, the interconnections between them, and their functionalities. Hopfield [13] introduced the Hopfield network, which works in a content-addressable or associative manner with binary threshold nodes. In this paper, we adopt the self-organizing map (SOM), invented by the Finnish professor Teuvo Kohonen, as the memory module to enrich computation [14]. The SOM is based on biological models of neural systems [15]; thus, it can capture the intuitive essence of the memory system while maintaining simplicity.

Currently, the volume and speed of big data generation are increasing exponentially, and such data require efficient models to acquire memory augmented knowledge. In this paper, we propose a neural network architecture named MACNN (memory augmented convolutional neural network) and evaluate it on datasets to which conventionally only CNNs are applied. We structure the subsequent sections according to our contributions. We first introduce the background knowledge of the SOM to reveal its suitability as a memory module, followed by a detailed explanation of the overall MACNN architecture. We then experiment on different datasets to show the practicability and superiority of the designed architecture via benchmarked results. Simultaneously, we illustrate the learned memory representations to interpret the performance of the proposed network. Finally, we discuss the difference between LSTM and SOM and anticipate future work to improve MACNN.

The remainder of this article is organized as follows. The SOM model is described in Section 2. Section 3 introduces the proposed MACNN network architecture, which combines CNN and SOM. Extensive experimental evaluation is provided in Section 4. Section 5 presents discussion and analysis of the experimental results. Section 6 provides the conclusion and identifies future research work.

Section snippets

SOM model

The SOM has been a popular neural network model and has found various applications in numerous fields [16]. It belongs to the category of competitive learning networks and implements the winner-take-all learning strategy [14]. The arrangement of neurons, together with the unsupervised learning paradigm, enables it to automatically represent features intrinsic to the input sample space in a topologically suitable manner. The updating process, which drives the best-matching unit (BMU) and its
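To make the update concrete, the following is a minimal sketch, in Python/NumPy, of one SOM training step with a winner-take-all selection and a Gaussian neighbourhood; it is our own illustration rather than the paper's code, and the lattice size, learning rate and neighbourhood width are placeholder values.

    import numpy as np

    rows, cols, N = 10, 10, 784            # lattice size and input dimension (example values)
    rng = np.random.default_rng(0)
    weights = rng.random((rows, cols, N))  # one N-dimensional weight vector per lattice neuron

    def train_step(x, weights, lr=0.1, sigma=2.0):
        # 1. Find the best-matching unit (BMU): the neuron whose weight vector
        #    is closest to the input x in Euclidean distance.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)

        # 2. Update the BMU and its lattice neighbours; the update strength
        #    decays with lattice distance from the BMU (Gaussian neighbourhood).
        rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
        lattice_dist2 = (rr - bmu[0]) ** 2 + (cc - bmu[1]) ** 2
        h = np.exp(-lattice_dist2 / (2 * sigma ** 2))   # neighbourhood function
        weights += lr * h[..., None] * (x - weights)    # pull neighbours towards x
        return bmu

    x = rng.random(N)                # a flattened input sample
    bmu = train_step(x, weights)     # index of the winning neuron in the lattice

In practice the learning rate and neighbourhood width are usually decayed over training iterations so that the map first organizes globally and then fine-tunes locally.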

Proposed MACNN network architecture

In this work, we design a hybrid architecture that jointly combines computation and memory by incorporating a SOM into the CNN. The general structure of the SOM is shown in Fig. 2(A). Each component of the input vector is connected with every neuron arranged in a two-dimensional lattice. Assuming that the dimension of the input sample is N, let i denote the ith input component and j denote the jth neuron; then, the weight between component i and neuron j is $w_{ij}$, so the weight vector of neuron j is $w_{1j}, w_{2j}, \ldots, w_{Nj}$.
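As an illustration of this hybrid idea only, the sketch below (with hypothetical layer sizes and a 10x10 lattice; the actual MACNN configuration may differ) shows a small CNN producing an N-dimensional feature vector whose components are all connected to the neurons of a two-dimensional SOM lattice, with the BMU of that lattice serving as a memory cue alongside the CNN output.

    import numpy as np
    import torch
    import torch.nn as nn

    class FeatureCNN(nn.Module):
        """Small CNN feature extractor; layer sizes are illustrative only."""
        def __init__(self, feat_dim=64):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.fc = nn.Linear(32 * 7 * 7, feat_dim)   # assumes 28x28 grayscale inputs

        def forward(self, x):
            return self.fc(self.conv(x).flatten(1))

    cnn = FeatureCNN()
    som_weights = np.random.rand(10, 10, 64)            # 10x10 lattice, N = 64

    with torch.no_grad():
        image = torch.randn(1, 1, 28, 28)                # dummy 28x28 grayscale image
        feature = cnn(image).squeeze(0).numpy()          # N-dimensional feature vector
        # Route the feature to the SOM memory: the BMU index acts as a memory cue
        # that can be combined with the CNN output for the final prediction.
        dists = np.linalg.norm(som_weights - feature, axis=-1)
        bmu = np.unravel_index(dists.argmin(), dists.shape)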

To conceive a memory module based on

Experimental studies

Three datasets are utilized to assess the practicability of the proposed network architecture. First, the clean and well-understood MNIST dataset is used to check feasibility. If the proposed model does not work, the use of the clean MNIST data narrows the cause down to the network structure rather than the dataset. The second dataset consists of breast cancer histopathology image patches, and the third is an electroencephalogram (EEG) dataset captured from a visual oddball task.

Discussion and analysis

The significance of human memory in the cognitive processes that shape our daily life is self-evident. It is well known that the human memory system consists of working memory, short-term memory and long-term memory. Long-term memory is further categorized into explicit memory and implicit memory. How to draw on human memory to design a more intelligent system is admittedly a meaningful research direction in this DL era.

Through these restricted analyses, we proposed that the SOM plays the role of memory in

Conclusion and future direction

Currently, the volume and speed of big data generation are increasing exponentially, and such data require efficient models to acquire memory augmented knowledge. In this paper, we proposed a new network architecture named MACNN, designed from the memory perspective, i.e., augmenting an existing CNN with a SOM module. A corresponding network architecture incorporating memory is presented to instantiate the distributed knowledge representation philosophy. By experiments, we showed the better

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to express their sincere appreciation to the editor and anonymous reviewers for their insightful comments, which greatly improved the quality of this paper. This work was supported in part by the Australian Research Council (ARC) under discovery grants DP180100670 and DP180100656, the NSW Defense Innovation Network and NSW State Government of Australia under grant DINPP2019 S1-03/09, and the Office of Naval Research Global, US under Cooperative Agreement Number


References (40)

  • K. Robbins et al., "An 18-subject EEG data collection using a visual-oddball task," Data in Brief, Feb. 2018.
  • W. Zhang et al., "Multi-task learning with multi-view weighted fusion attention for artery-specific calcification analysis," Information Fusion, 2021.
  • A. Graves et al., "Hybrid computing using a neural network with dynamic external memory," Nature, 2016.
  • S. Sukhbaatar, A. Szlam, et al., "End-To-End memory networks," [Online]. arXiv:1503.08895v5,...
  • W. Ding et al., "Deep neuro-cognitive co-evolution for fuzzy attribute reduction by quantum leaping PSO with nearest-neighbor memeplexes," IEEE Trans. Cybern., July 2019.
  • K. Muhammad et al., "Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey," IEEE Trans. Neural Netw. Learn. Syst., 2021.
  • W. Ding, M. Abdel-Basset, H. Hawash, and W. Pedrycz, "Multimodal infant brain segmentation by fuzzy-informed deep...
  • M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," European Conference on Computer...
  • A. P. Yonelinas et al., "Recollection and familiarity: Examining controversial assumptions and new directions," Hippocampus, Nov. 2010.
  • J. Deng et al., "ImageNet: A large-scale hierarchical image database," IEEE Comput. Vision Pattern Recog. (CVPR), 2009.
  • T. Y. Lin et al., "Microsoft COCO: Common objects in context."
  • N. Jankowski et al., Meta-Learning in Computational Intelligence, 2011.
  • Anonymous, "Memory & Learning," NeuroImage, 2004.
  • I. Kant, Critique of Pure Reason (The Cambridge Edition of the Works of Immanuel Kant), 1999.
  • J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of...
  • T. Kohonen, "Self-Organized formation of topologically correct feature maps," Biol. Cybern., 1982.
  • Chr. von der Malsburg, "Self-organization of orientation sensitive cells in the striate cortex," Kybernetik, 1973.
  • T. Kohonen et al., "Engineering applications of the self-organizing map," Proc. IEEE, 1996.
  • D. Graupe and H. Kordylewski, "Large scale memory (LAMSTAR) neural network for medical diagnosis," in Proceedings of...
  • H. Nimmagadda et al., "Self organising maps: An interesting tool for exploratory data analysis," Res. J. Eng. Technol., 2012.

Weiping Ding (M'16-SM'19) received the Ph.D. degree in Computer Science from Nanjing University of Aeronautics and Astronautics (NUAA), Nanjing, China, in 2013. He was a Visiting Scholar at the University of Lethbridge (UL), Alberta, Canada, in 2011. From 2014 to 2015, he was a Postdoctoral Researcher at the Brain Research Center, National Chiao Tung University (NCTU), Hsinchu, Taiwan. In 2016, he was a Visiting Scholar at the National University of Singapore (NUS), Singapore. From 2017 to 2018, he was a Visiting Professor at the University of Technology Sydney (UTS), Ultimo, NSW, Australia. Dr. Ding is currently the Chair of the IEEE CIS Task Force on Granular Data Mining for Big Data. He is a Senior Member of IEEE and CCF, and a member of IEEE-CIS, ACM and CCAI. He is a member of the Technical Committee on Soft Computing of IEEE SMCS, on Granular Computing of IEEE SMCS, and on Data Mining and Big Data Analytics of IEEE CIS. He is currently a Full Professor with the School of Information Science and Technology, Nantong University, Nantong, China. His main research directions involve data mining, granular computing, evolutionary computing, machine learning and big data analytics. He has published more than 80 peer-reviewed journal and conference papers in this field, including in IEEE T-FS, T-NNLS, T-CYB, T-SMCS, T-BME, T-EVC, T-II, T-ETCI and T-ITS, and he holds 15 approved invention patents. Four of his co-authored papers have been selected as ESI Highly Cited Papers. Dr. Ding was an Excellent Young Teacher (Qing Lan Project) of Jiangsu Province in 2014, a High-Level Talent (Six Talent Peak) of Jiangsu Province in 2016, and a Middle-aged and Young Academic Leader (Qing Lan Project) of Jiangsu Province in 2019. He was awarded the Best Paper Award of ICDMA'15. Dr. Ding was a recipient of the Medical Science and Technology Award (Second Prize) of Jiangsu Province, China, in 2017, and the Education Teaching and Research Achievement Award (Third Prize) of Jiangsu Province, China, in 2018. He was awarded two Chinese Government Scholarships for Overseas Studies in 2011 and 2016. Dr. Ding is vigorously involved in editorial activities. He currently serves on the Editorial Advisory Board of Knowledge-Based Systems (Elsevier) and the Editorial Boards of Information Fusion (Elsevier), Engineering Applications of Artificial Intelligence (Elsevier), and Applied Soft Computing (Elsevier). He has served or serves as an Associate Editor of IEEE Transactions on Fuzzy Systems, IEEE/CAA Journal of Automatica Sinica, Information Sciences (Elsevier), Neurocomputing (Elsevier), Swarm and Evolutionary Computation (Elsevier), IEEE Access and Journal of Intelligent & Fuzzy Systems, and as Co-Editor-in-Chief of the Journal of Artificial Intelligence and System. He is the Leading Guest Editor of special issues in several prestigious journals, including IEEE Transactions on Evolutionary Computation, IEEE Transactions on Fuzzy Systems, IEEE Transactions on Emerging Topics in Computational Intelligence, Information Fusion, and Information Sciences.

Yurui Ming received a Ph.D. degree in Artificial Intelligence from the University of Technology Sydney, Australia, in 2020. He previously worked in telecommunication companies as a software engineer on the design and implementation of protocol stacks for computer networks. His current research interests focus on applying variant neural networks, especially deep ones, to analyze electroencephalogram (EEG) data.

Yu-kai Wang (M'13) received the B.S. degree in mathematics education from National Taichung University of Education, Taichung, Taiwan, in 2006, the M.S. degree in biomedical engineering from National Chiao Tung University (NCTU), Hsinchu, Taiwan, in 2009, and the Ph.D. degree in computer science from NCTU, Hsinchu, Taiwan, in 2015. He was a Visiting Scholar with the Swartz Center for Computational Neuroscience, University of California at San Diego, La Jolla, CA, USA, from 2013 to 2014. From 2016 to 2017, he was a Postdoctoral Researcher with the Centre for Artificial Intelligence at the University of Technology Sydney, Australia. He is currently a Lecturer in the Faculty of Engineering and Information Technology, University of Technology Sydney. His current research interests include machine learning, computational neuroscience, biomedical signal processing, and the brain computer interface.

Chin-Teng Lin (S'88-M'91-SM'99-F'05) received the B.S. degree from National Chiao-Tung University (NCTU), Taiwan, in 1986, and the Master's and Ph.D. degrees in electrical engineering from Purdue University, USA, in 1989 and 1992, respectively. He is currently a Distinguished Professor in the Faculty of Engineering and Information Technology, and Co-Director of the Center for Artificial Intelligence, University of Technology Sydney, Australia. He also holds an Honorary Chair Professorship of Electrical and Computer Engineering at NCTU and an Honorary Professorship at the University of Nottingham. Dr. Lin was elevated to IEEE Fellow for his contributions to biologically inspired information systems in 2005, and was elevated to International Fuzzy Systems Association (IFSA) Fellow in 2012. Dr. Lin received the IEEE Fuzzy Systems Pioneer Award in 2017. He served as the Editor-in-Chief of IEEE Transactions on Fuzzy Systems from 2011 to 2016. He also served on the Board of Governors of the IEEE Circuits and Systems (CAS) Society in 2005-2008, the IEEE Systems, Man, and Cybernetics (SMC) Society in 2003-2005, and the IEEE Computational Intelligence Society in 2008-2010, and as Chair of the IEEE Taipei Section in 2009-2010. Dr. Lin was a Distinguished Lecturer of the IEEE CAS Society from 2003 to 2005 and of the CIS Society from 2015 to 2017. He served as the Chair of the IEEE CIS Distinguished Lecturer Program Committee in 2018. He served as the Deputy Editor-in-Chief of IEEE Transactions on Circuits and Systems-II in 2006-2008. Dr. Lin was the Program Chair of the IEEE International Conference on Systems, Man, and Cybernetics in 2005 and the General Chair of the 2011 IEEE International Conference on Fuzzy Systems. Dr. Lin is the coauthor of Neural Fuzzy Systems (Prentice-Hall) and the author of Neural Fuzzy Control Systems with Structure and Parameter Learning (World Scientific). He has published more than 300 journal papers (total citations: 19,232, H-index: 64, i10-index: 243) in the areas of neural networks, fuzzy systems, brain computer interface, multimedia information processing, and cognitive neuro-engineering, including over 120 IEEE journal papers.
