A brain-inspired information processing algorithm and its application in text classification
Introduction
In recent years, artificial intelligence technology has made rapid progress and has greatly surpassed human capabilities in many specific fields, such as image recognition (He, Zhang, Ren, & Sun, 2016), speech recognition (Chorowski, Bahdanau, Serdyuk, Cho, & Bengio, 2015), and data mining (Holzinger, 2015).
In the field of pattern recognition, most algorithms reported in the literature are based on feature extraction followed by a classifier (Bishop, 1995, Miroczuk and Protasiewicz, 2018). The main function of feature extraction is to map the original data to the input space of the classifier while simplifying and refining the data. Many feature extraction methods have been developed over the past decades. In image recognition, they have gradually evolved from manually selected features to convolution and pooling (Cui et al., 2016, He et al., 2015, Lowe et al., 1999, Simonyan and Zisserman, 2014). In text classification, by comparison, many works reported in the last five years have focused on feature selection and weight optimization (Pinheiro et al., 2015, Uysal, 2016, Ghareb et al., 2016, Agnihotri et al., 2017, Chen et al., 2016). The main goals of these studies are to simplify feature vectors, remove redundant information, and improve training speed as well as accuracy. The most common feature extraction method is the vector space model (VSM), also known as the bag-of-words model. The central idea of VSM (Salton et al., 1975, Rajan et al., 2009) is to use the word frequencies of a text to form a vector space. Unlike English texts, Chinese texts contain no spaces to separate neighboring words, each of which may consist of one or several Chinese characters. To solve this issue, a word segmentation (WS) algorithm (Sproat and Emerson, 2003, Liu and Chen, 2015, Zhang et al., 2015) is usually used to segment Chinese sentences into individual words. However, a WS algorithm needs to be trained on a large number of texts, and choosing the training set for a WS model requires strong, subjective human intervention.
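To make the VSM idea concrete, the sketch below builds a term-frequency vector over a small, hand-picked vocabulary. The whitespace tokenization is our simplification for English text; for Chinese, the tokens would first have to be produced by a WS algorithm, which is exactly the dependency discussed above.

```python
from collections import Counter

def vsm_vector(tokens, vocabulary):
    """Map a tokenized text to a term-frequency vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]

# English tokens come from splitting on spaces; Chinese would need a WS algorithm first.
tokens = "the cat sat on the mat the cat".split()
print(vsm_vector(tokens, ["cat", "sat", "mat"]))  # [2, 1, 1]
```

Note that the vector discards word order entirely, which is the time-sequence limitation of VSM discussed later in this paper.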
As outlined in Table 1, the trend of research on feature extraction is to reduce human intervention, which is strongly subjective and one-sided. Years ago, researchers used features requiring strong human intervention, such as histograms (Ito, Ohyama, Wakabayashi, & Kimura, 2012), for image recognition. More recently, methods with less human intervention, such as convolution and pooling (Krizhevsky, Sutskever, & Hinton, 2012), have shown advantages. In Chinese text classification, however, word segmentation and the vector space model are still the most commonly used methods for data preprocessing and feature extraction. Training these algorithms requires massive labeled data with strong human intervention. As a result, a WS model trained on dataset A can hardly deal with dataset B if the two datasets differ greatly. For example, after training on modern standard Chinese texts, a word segmentation system may perform poorly on Chinese dialects or Classical Chinese (Yi et al., 2007).
Although the human brain is slow at calculation, it responds quickly in familiar environments. People can generally skim an article quickly without reading every word. Classical cognitive science holds that the human brain constantly predicts the information it is about to receive (Hosoya et al., 2005, Harman et al., 1999, Clark, 2013). In daily life, we constantly record the information around us in the brain. When we receive familiar inputs again, we can quickly recall similar memories as a prediction of the upcoming inputs or as a supplement to the inputs' missing information. This information processing mechanism of the human brain has two great benefits. First, as mentioned above, missing information can be filled in automatically, which makes the brain's information processing robust. Second, it helps the brain ignore redundant or repeated information, which makes the processing of familiar information several times faster.
Since we live in a time-series-related world, most events in the real world change over time. The human brain continuously receives time-related information to achieve real-time learning and prediction (Bonda et al., 1995, Herzog et al., 2020). Therefore, if we want to build an algorithm that realizes real-time learning and prediction functions similar to the human brain's, we must consider the time sequence of information, which is unavoidable in real life. Most conventional machine learning methods can only perform time-independent classification (Han et al., 2019, Sezer et al., 2020). In addition, the input the human brain receives is mostly undefined information, and the brain seeks similarities between undefined new information and learned information through analogy, in order to realize unsupervised learning and prediction (Bar, 2007). However, most machine learning algorithms need massive, carefully defined information (labeled data) to train a model, and they perform poorly when receiving undefined information (unlabeled data) (Jiang, Chen, Yuan, & Yao, 2017). As shown in Table 2, most recent studies in Chinese text classification rely on word segmentation and the vector space model, both of which require labeled information for training. It is highly desirable to develop a new, brain-like information processing algorithm that automatically handles undefined inputs.
It is meaningful to study the information processing mechanism of the human brain for two reasons. For one thing, this mechanism is capable of processing various types of information inputs and does not depend on a specific preprocessing method such as text segmentation. For another, input information can be quickly read, screened, and processed in familiar environments through learning and prediction. The human brain can store learned information in neural circuits composed of neurons (Silva and Zhou, 2009, Deng and Aimone J B, 2010). It can make predictions by analogy between new inputs and learned information (Bar, 2007), and it establishes connections between new inputs and learned information through synapses (Yang, Pan, & Gan, 2009). In addition, short-term memories formed by the human brain are forgotten unless they are repeatedly strengthened in time to form long-term memories (Frankland, Köhler, & Josselyn, 2013). Inspired by these behaviors, we propose a new information processing algorithm that mimics the function of the human brain. We use boxes to store learned information and routes to reflect the connections between pieces of information. Like the human brain, our algorithm compares new input information with the box-route network constructed from learned information to achieve learning and prediction. The algorithm also introduces a brain-like forgetting mechanism, which reduces the impact of unimportant information or noise. To demonstrate this new algorithm, we apply it to a text classification task covering up to 30 Chinese litterateurs and compare its effectiveness with the vector space model method.
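As an illustration of how boxes, routes, and forgetting could fit together, the following toy sketch stores items in boxes, strengthens a route between two consecutive items each time the pair repeats, and forgets routes that were never strengthened. The class name, the pair-counting scheme, and the forgetting threshold are our own illustrative assumptions, not the paper's exact definitions.

```python
class BoxRouteNetwork:
    """Toy sketch: boxes store seen items; routes count transitions; weak routes fade."""

    def __init__(self, forget_threshold=1):
        self.boxes = set()     # learned pieces of information
        self.routes = {}       # (prev, curr) -> strength (repetition count)
        self.forget_threshold = forget_threshold

    def learn(self, sequence):
        prev = None
        for item in sequence:
            self.boxes.add(item)                # store new information in a box
            if prev is not None:
                key = (prev, item)              # connect consecutive items with a route
                self.routes[key] = self.routes.get(key, 0) + 1
            prev = item

    def forget(self):
        # short-term memory fades: drop routes never strengthened by repetition
        self.routes = {k: v for k, v in self.routes.items()
                       if v > self.forget_threshold}

net = BoxRouteNetwork()
net.learn("ABABAC")   # pairs: (A,B) x2, (B,A) x2, (A,C) x1
net.forget()          # the one-off route (A,C) is forgotten as noise
```

Because learning walks the input in order, the route counts preserve the time sequence of the information, unlike a bag-of-words vector.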
The novelty of this research lies mainly in the following four aspects: (1) The proposed information processing algorithm mimics the functions of the human brain, especially information learning, comparison, prediction, and forgetting. (2) The algorithm is unsupervised and self-adaptive. It does not need manually labeled data sets for training and can independently learn useful information from a large amount of undefined inputs. (3) It retains the time sequence of input information and handles time-series-related data well, which cannot be achieved easily by the vector space model method commonly used in the literature (Yu, Xu, & Li, 2008). (4) We apply the algorithm to Chinese text classification scenarios to test its effectiveness. The data set we use consists of anthologies of Chinese litterateurs, which is more challenging than the datasets commonly used in the literature, such as Chinese news and email datasets (Mujtaba et al., 2017, Miao et al., 2018). To the best of our knowledge, this is the first study on the classification of Chinese litterateurs' anthologies. The algorithm has a further advantage when classifying Chinese text: it does not need a word segmentation algorithm to group Chinese characters into meaningful words.
Section snippets
Algorithm
A key idea of this new algorithm is to continuously learn new information as it is received, and to continuously identify repeated information to construct a hierarchical feature network. We focus on two basic brain-like functions: storage and connection. New information should be stored in a certain form, and there should be a type of connection among various pieces of information to describe their relationships. In this algorithm, we use “boxes” to store the information and “routes” to describe the relationships among them.
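One way to compare a new input against a stored box-route network is to check how many of its consecutive pairs already exist as routes; classification can then pick the writer whose network scores highest. The sketch below assumes routes are kept as a mapping from item pairs to repetition counts, which is our illustrative assumption rather than the paper's exact data structure.

```python
def similarity(sequence, routes):
    """Fraction of consecutive pairs in the input that already exist as routes."""
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for pair in pairs if pair in routes)
    return hits / len(pairs)

def classify(sequence, networks):
    """Assign the input to the label whose route network it matches best."""
    return max(networks, key=lambda label: similarity(sequence, networks[label]))

# Hypothetical per-writer networks for illustration only
networks = {
    "writer_1": {("A", "B"): 2, ("B", "A"): 2},
    "writer_2": {("A", "C"): 3, ("C", "A"): 1},
}
print(classify("ABAB", networks))  # writer_1
```

A familiar input scores high because most of its routes are already known, mirroring the fast recognition of familiar information described in the Introduction.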
Experiments and results
To demonstrate the feasibility and effectiveness of this new algorithm, it is necessary to prove that the box-route networks generated by different inputs can be distinguished from each other. As a first demonstration, we collected 30 anthologies of 30 Chinese writers as training materials to verify the effectiveness of our proposed algorithm. Proper data selection is of great importance; in this work, we use the following criteria to choose samples. (1) We only choose contemporary Chinese
Discussion
The experimental results show that our algorithm is effective in Chinese text classification. One of its advantages is that it needs less human intervention: it neither uses a word segmentation (WS) algorithm to group Chinese characters into meaningful words nor imposes restrictions on the features. Our algorithm also achieves higher accuracy on the Chinese anthology classification task than the conventional algorithms. In addition,
Conclusion and future work
This work presents a new information processing algorithm featuring a box-route structure, in analogy to the basic “conceptions” and the “associations” between different “conceptions” in the human brain. The main function of this algorithm is to learn from input information automatically and establish a box-route network. It has the advantage of good adaptability and needs no human intervention in data preprocessing, which means it can learn directly from undefined inputs (such as Chinese text
CRediT authorship contribution statement
Shenghong Mou: Conceptualization, Methodology, Software, Validation, Writing - original draft. Pengwei Du: Conceptualization, Software, Validation, Writing - review & editing. Zhiyuan Cheng: Supervision, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (49)
- Variable global feature selection scheme for automatic classification of text documents. Expert Systems with Applications (2017).
- The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences (2007).
- Turning from tf-idf to tf-igm for term weighting in text classification. Expert Systems with Applications (2016).
- Classification of Chinese texts based on recognition of semantic topics. Cognitive Computation (2016).
- Hippocampal neurogenesis and forgetting. Trends in Neurosciences (2013).
- Hybrid feature selection based on enhanced genetic algorithm for text categorization. Expert Systems with Applications (2016).
- Active manual control of object views facilitates visual recognition. Current Biology (1999).
- All in good time: Long-lasting postdictive effects reveal discrete perception. Trends in Cognitive Sciences (2020).
- Dynamic predictive coding by the retina. American Journal of Ophthalmology (2005).
- Chinese text classification model based on deep learning. Future Internet (2018).
- A multi-label classification based approach for sentiment classification. Expert Systems with Applications.
- Data-driven global-ranking local feature selection methods for text categorization. Expert Systems with Applications.
- Automatic classification of Tamil documents using vector space model and artificial neural network. Expert Systems with Applications.
- Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Applied Soft Computing.
- An improved global feature selection scheme for text classification. Expert Systems with Applications.
- Latent semantic analysis for text categorization using neural network. Knowledge-Based Systems.
- Chinese comments sentiment classification based on word2vec and SVMperf. Expert Systems with Applications.
- View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron.
- Neural Networks for Pattern Recognition.
- Neural correlates of mental transformations of the body-in-space. Proceedings of the National Academy of Sciences.
- A method for Chinese text classification based on apparent semantics and latent aspects. Journal of Ambient Intelligence and Humanized Computing.
- Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences.
- The persistences of vision. Philosophical Transactions of the Royal Society of London B, Biological Sciences.
1 These authors contributed equally to this work and should be considered co-first authors.