A brain-inspired information processing algorithm and its application in text classification

https://doi.org/10.1016/j.eswa.2021.114828

Highlights

  • A new feature extraction algorithm with less human intervention.

  • The algorithm is suitable for the input of continuous time-series data.

  • The algorithm can work without word segmentation when processing Chinese texts.

  • The classification experiment covers 30 anthologies by different Chinese litterateurs.

Abstract

Cognitive scientists believe that the human brain constantly predicts the information it is about to receive, and that this predictive ability is acquired from the experience of information received over a lifetime. Inspired by this behavior of the brain, we propose a new information processing algorithm whose building blocks are named boxes and routes. Boxes store the learned information, and routes represent the relationships between pieces of information. The algorithm features generality and objectivity: it imitates mechanisms of the human brain, providing functions such as information learning, comparison, prediction, and forgetting. By using routes and high-order boxes, it also handles continuous time-series data well. The algorithm is self-adaptive and unsupervised; it needs no manually labeled information to train the model. It can learn useful information from undefined data and then construct a hierarchical network corresponding to the characteristics of the input, which can be used for classification or prediction. To validate the new algorithm, we build a classifier on top of the hierarchical network and apply it to text classification, training it on a collection of Chinese literature from 30 litterateurs. In the 10-class setting, the optimal average classification accuracy reaches 79.5%, outperforming approaches commonly used in the literature and verifying the effectiveness of the proposed algorithm.

Introduction

In recent years, artificial intelligence technology has made rapid progress and has greatly surpassed human capabilities in many specific fields, such as image recognition (He, Zhang, Ren, & Sun, 2016), speech recognition (Chorowski, Bahdanau, Serdyuk, Cho, & Bengio, 2015), and data mining (Holzinger, 2015).

In the field of pattern recognition, most algorithms reported in the literature are based on feature extraction followed by a classifier (Bishop, 1995, Mirończuk and Protasiewicz, 2018). The main function of feature extraction is to map the original data into the input space of the classifier while simplifying and refining the data. Many feature extraction methods have been developed over the past decades. In image recognition, feature extraction has gradually evolved from manually selected features to convolution and pooling (Cui et al., 2016, He et al., 2015, Lowe et al., 1999, Simonyan and Zisserman, 2014). In text classification, by contrast, much of the work reported in the last five years has focused on feature selection and weight optimization (Pinheiro et al., 2015, Uysal, 2016, Ghareb et al., 2016, Agnihotri et al., 2017, Chen et al., 2016). The main goals of these studies are to simplify feature vectors, remove redundant information, and improve training speed as well as accuracy. The most common feature extraction method is the vector space model (VSM), also known as the bag-of-words model. The central idea of VSM (Salton et al., 1975, Rajan et al., 2009) is to form a vector space from the word frequencies of a text. Unlike in English, there are no spaces in Chinese texts to separate neighboring Chinese words, each of which may consist of one or several Chinese characters. To address this, a word segmentation (WS) algorithm (Sproat and Emerson, 2003, Liu and Chen, 2015, Zhang et al., 2015) is usually used to segment Chinese sentences into individual words. However, WS algorithms must be trained on a large number of texts, and choosing the training set involves strong, subjective human intervention.

As outlined in Table 1, the trend in feature extraction research is to reduce human intervention because of its strong subjectivity and one-sidedness. Years ago, researchers used features requiring strong human intervention, such as histograms (Ito, Ohyama, Wakabayashi, & Kimura, 2012), for image recognition; more recently, methods with less human intervention, such as convolution and pooling (Krizhevsky, Sutskever, & Hinton, 2012), have shown clear advantages. In Chinese text classification, however, word segmentation and the vector space model remain the most commonly used methods for data pretreatment and feature extraction. Training these algorithms requires massive labeled datasets built with strong human intervention. As a result, a WS model trained on dataset A can hardly deal with dataset B if the two differ greatly. For example, after training on modern standard Chinese texts, a word segmentation system may perform poorly on Chinese dialects or Classical Chinese (Yi et al., 2007).

Although the human brain is slow at calculation, it responds quickly in familiar environments. People can, for instance, skim an article quickly without reading every word. Classical cognitive science holds that the human brain constantly predicts the information to be received (Hosoya et al., 2005, Harman et al., 1999, Clark, 2013). In daily life, we continuously record the information around us into the brain. When we encounter familiar inputs again, we quickly recall similar memories, either as predictions of the upcoming inputs or as supplements to their missing parts. This information processing mechanism of the human brain has two great benefits. First, as mentioned above, missing information can be filled in automatically, which enhances the robustness of the brain's information processing. Second, it helps the brain ignore redundant or repeated information, which makes processing familiar information several times faster.

Since we live in a time-series-related world, most real-world events change over time. The human brain continuously receives time-related information to achieve real-time learning and prediction (Bonda et al., 1995, Herzog et al., 2020). An algorithm that aims to realize similar real-time learning and prediction must therefore take the time sequence of information into account, which is unavoidable in real life. Most conventional machine learning methods can only perform time-independent classification (Han et al., 2019, Sezer et al., 2020). In addition, the input information received by the human brain is mostly undefined, and the brain seeks similarities between undefined new information and learned information through analogy, realizing unsupervised learning and prediction (Bar, 2007). By contrast, most machine learning algorithms need massive amounts of carefully defined information (labeled data) to train a model, and they perform poorly on undefined information (unlabeled data) (Jiang, Chen, Yuan, & Yao, 2017). As shown in Table 2, most recent studies in Chinese text classification rely on word segmentation and the vector space model, both of which require labeled information for training. It is therefore highly desirable to develop a new, brain-like information processing algorithm that automatically handles undefined inputs.

It is meaningful to study the information processing mechanism of the human brain for two reasons. First, it is capable of processing various types of information inputs and does not depend on a specific preprocessing method such as text segmentation. Second, through learning and prediction, input information can be quickly read, screened, and processed in familiar environments. The human brain stores learned information in neural circuits composed of neurons (Silva and Zhou, 2009, Deng and Aimone, 2010). It makes predictions by analogy between new inputs and learned information (Bar, 2007), and establishes connections between them through synapses (Yang, Pan, & Gan, 2009). Moreover, short-term memories are forgotten if they are not repeatedly strengthened in time to form long-term memories (Frankland, Köhler, & Josselyn, 2013). Inspired by these behaviors of the human brain, we propose a new information processing algorithm that mimics its function. We use boxes to store the learned information and routes to reflect the connections between pieces of information. Like the human brain, our algorithm compares new input information with the box-route network constructed from learned information to achieve learning and prediction. In addition, it introduces a brain-like forgetting mechanism that reduces the impact of unimportant information or noise. To demonstrate the new algorithm, we apply it to the classification of texts by up to 30 Chinese litterateurs and compare its effectiveness with the vector space model.
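The forgetting behavior described above can be illustrated with a minimal sketch. This is not the authors' implementation; the names (`step`, `DECAY`, `REINFORCE`, `THRESHOLD`) and the particular decay rule are illustrative assumptions, showing only the general idea that unreinforced memories fade while repeated ones persist:

```python
# Hypothetical sketch of a brain-like forgetting mechanism: every stored
# item decays at each time step, observed items are strengthened, and
# items that fall below a threshold are forgotten (deleted).

DECAY = 0.9        # multiplicative decay applied to every item per step
REINFORCE = 1.0    # strength added when an item is observed again
THRESHOLD = 0.5    # items weaker than this are forgotten

def step(memory, observed):
    """Decay all stored items, strengthen observed ones, drop weak ones."""
    for item in list(memory):
        memory[item] *= DECAY
    for item in observed:
        memory[item] = memory.get(item, 0.0) + REINFORCE
    for item in [k for k, v in memory.items() if v < THRESHOLD]:
        del memory[item]
    return memory

memory = {}
step(memory, ["rain"])            # short-term memory formed
for _ in range(10):
    step(memory, [])              # never reinforced, so it decays away
print("rain" in memory)           # False: forgotten
for _ in range(10):
    step(memory, ["sun"])         # repeatedly strengthened
print("sun" in memory)            # True: retained as long-term memory
```

Under these assumed constants, an item seen once survives only a few steps (0.9^7 ≈ 0.48 drops below the threshold), while an item reinforced at every step never falls below strength 1.0, mirroring the short-term versus long-term memory distinction cited above.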

The novelty of this research lies mainly in four aspects. (1) The proposed information processing algorithm mimics functions of the human brain, especially information learning, comparison, prediction, and forgetting. (2) The algorithm is unsupervised and self-adaptive: it needs no manually labeled datasets for training and can independently learn useful information from a large amount of undefined inputs. (3) It retains the time sequence of input information and handles time-series-related data well, which cannot easily be achieved by the vector space model commonly used in the literature (Yu, Xu, & Li, 2008). (4) We apply the algorithm to Chinese text classification to test its effectiveness. Our dataset consists of anthologies of Chinese litterateurs, which is more challenging than commonly used datasets such as Chinese news and email corpora (Mujtaba et al., 2017, Miao et al., 2018). To the best of our knowledge, this is the first study on the classification of anthologies of Chinese litterateurs. The algorithm has a further advantage in Chinese text classification: it does not need a word segmentation algorithm to group Chinese characters into meaningful words.

Section snippets

Algorithm

A key idea of this new algorithm is to continuously learn the new information as it is received, and continuously identify repeated information to construct a hierarchical feature network. We focus on two basic brain-like functions: storage and connection. The new information should be stored in a certain form, and there should be a type of connection among various pieces of information to describe their relationship. In this algorithm, we use “boxes” to store the information and “routes” to
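As a rough illustration of the storage-and-connection idea, consider the following sketch. It is an assumption-laden reading of the description above, not the paper's actual algorithm: here a "box" is simply a symbol counter and a "route" a counter of consecutive-symbol transitions, with prediction following the strongest outgoing route. The class and method names are hypothetical:

```python
# Minimal sketch (not the authors' implementation): "boxes" store how often
# each symbol (e.g., a Chinese character) has been seen, and "routes" store
# how often one symbol follows another, preserving the time order of input.
from collections import defaultdict

class BoxRouteNetwork:
    def __init__(self):
        self.boxes = defaultdict(int)                         # symbol -> count
        self.routes = defaultdict(lambda: defaultdict(int))   # a -> b -> count

    def learn(self, stream):
        """Read a symbol stream in order, updating boxes and routes."""
        prev = None
        for sym in stream:
            self.boxes[sym] += 1
            if prev is not None:
                self.routes[prev][sym] += 1
            prev = sym

    def predict(self, sym):
        """Predict the most likely next symbol via the strongest route."""
        nxt = self.routes.get(sym)
        if not nxt:
            return None
        return max(nxt, key=nxt.get)

net = BoxRouteNetwork()
net.learn("abcabcabd")
print(net.predict("b"))   # 'c': "b" is followed by "c" twice but "d" once
```

Repeated patterns strengthen the same routes, so the network both identifies repeated information and retains the input's time sequence, which is the behavior the paragraph above attributes to the box-route structure.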

Experiments and results

To demonstrate the feasibility and effectiveness of the new algorithm, it is necessary to show that the box-route networks generated by different inputs can be distinguished from one another. As a first demonstration, we collected 30 anthologies by 30 Chinese writers as training material. Proper data selection is of great importance; in this work, we use the following methods to choose samples. (1) We only choose contemporary Chinese

Discussion

The experimental results show that our algorithm is effective for Chinese text classification. One advantage of the algorithm is that it needs less human intervention: it requires neither a word segmentation (WS) algorithm to group Chinese characters into meaningful words, nor restrictions imposed on the features. Our algorithm also achieves higher accuracy on the Chinese anthology classification task than conventional algorithms. In addition,

Conclusion and future work

This work presents a new information processing algorithm featuring a box-route structure, in analogy to basic "conceptions" and the "associations" between different "conceptions" in the human brain. The main function of the algorithm is to learn from input information automatically and establish a box-route network. It has the advantage of good adaptability and needs no human intervention in data preprocessing, which means it can learn directly from undefined inputs (such as Chinese text

CRediT authorship contribution statement

Shenghong Mou: Conceptualization, Methodology, Software, Validation, Writing - original draft. Pengwei Du: Conceptualization, Software, Validation, Writing - review & editing. Zhiyuan Cheng: Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (49)

  • S.M. Liu et al.

    A multi-label classification based approach for sentiment classification

    Expert Systems with Applications

    (2015)
  • R.H. Pinheiro et al.

    Data-driven global-ranking local feature selection methods for text categorization

    Expert Systems with Applications

    (2015)
  • K. Rajan et al.

    Automatic classification of Tamil documents using vector space model and artificial neural network

    Expert Systems with Applications

    (2009)
  • O.B. Sezer et al.

    Financial time series forecasting with deep learning: A systematic literature review: 2005–2019

    Applied Soft Computing

    (2020)
  • A.K. Uysal

    An improved global feature selection scheme for text classification

    Expert Systems with Applications

    (2016)
  • B. Yu et al.

    Latent semantic analysis for text categorization using neural network

    Knowledge-Based Systems

    (2008)
  • D. Zhang et al.

    Chinese comments sentiment classification based on word2vec and SVMperf

    Expert Systems with Applications

    (2015)
  • H.M. Ahissar

    View from the top: Hierarchies and reverse hierarchies in the visual system

    Neuron

    (2002)
  • C.M. Bishop

    Neural Networks for Pattern Recognition

    (1995)
  • E. Bonda et al.

    Neural correlates of mental transformations of the body-in-space

    Proceedings of the National Academy of Sciences

    (1995)
  • Y.-W. Chen et al.

    A method for Chinese text classification based on apparent semantics and latent aspects

    Journal of Ambient Intelligence and Humanized Computing

    (2015)
  • J. Chorowski et al.
    (2015)
  • A. Clark

    Whatever next? predictive brains, situated agents, and the future of cognitive science

    Behavioral and Brain Sciences

    (2013)
  • M. Coltheart

    The persistences of vision

    Philosophical Transactions of the Royal Society of London. B, Biological Sciences

    (1980)
    1

    These authors contributed equally to this work and should be considered co-first authors.