Neurocomputing

Volume 149, Part B, 3 February 2015, Pages 1125-1134

Application of a staged learning-based resource allocation network to automatic text categorization

https://doi.org/10.1016/j.neucom.2014.07.017

Abstract

In this paper, we propose a novel learning classifier that utilizes a staged learning-based resource allocation network (SLRAN) for text categorization. According to its learning progress, SLRAN is divided into a preliminary learning phase and a refined learning phase. In the former phase, an agglomerative hierarchical k-means method is utilized to create the initial structure of the hidden layer, which reduces the sensitivity of the network to the input data; subsequently, a novelty criterion is put forward to dynamically regulate the hidden layer centers. In the latter phase, a least squares method is used to raise the convergence rate of the network and further improve its classification ability. This staged learning approach builds a compact structure that decreases the computational complexity of the network and boosts its learning capability. In order to apply SLRAN to text categorization, we utilize a semantic similarity approach that reduces the input scale of the neural network and reveals the latent semantics among text features. The benchmark Reuters and 20-newsgroup datasets are tested in our experiments, and the extensive results reveal that the dynamic learning process of SLRAN improves classification performance in comparison with conventional classifiers, e.g. RAN, BP and RBF neural networks, and SVM.

Introduction

With the rapid development of Internet technology, the volume of online documents and information is growing exponentially. The demand to find useful information rapidly and accurately in such large collections has become a challenge for modern information retrieval (IR) technologies. Text categorization (TC) is a crucial and well-proven instrument for organizing large volumes of textual information. As a key technique in the IR field, TC has been extensively researched in recent decades. Meanwhile, TC has become a hot spot and has given rise to a series of related applications, including web page classification, query recommendation, spam filtering, and topic spotting.

In recent years, an increasing number of approaches based on intelligent agents and machine learning, e.g. support vector machines (SVM) [1], decision trees [2], [3], K-nearest neighbor (KNN) [4], [5], Bayesian models [6], [7], [8], and neural networks [9], [10], have been applied to text categorization. Although such methods have been extensively researched, present automated text classifiers still make errors and their effectiveness needs improvement, so text categorization remains an important research field. Since the artificial neural network is still one of the most powerful tools in the field of pattern recognition [11], we employ it as our classifier.

As a basic supervised network, the back propagation (BP) neural network suffers from a slow training rate and a high tendency to become trapped in local minima. In contrast, the relatively simple mechanism of the radial basis function (RBF) neural network [12], [13], [14] avoids the slow learning rate and displays a robust global approximation property. It is known that the key to building a successful RBF neural network is to ensure a proper number of units in its hidden layer [15]. More specifically, too few hidden layer nodes impair its decision-making ability, whereas redundant hidden layer nodes incur a high computational cost [16], [17], [18]. That is to say, too small a network architecture may cause under-fitting, while too large an architecture may over-fit the data [19], [20]. Although many learning methods have been studied to regulate the hidden nodes so as to obtain a suitable structure for the RBF network, the most remarkable approach is the resource allocation network (RAN) learning method put forward by Platt [21]. Platt made a significant contribution by developing an algorithm that regulates the hidden nodes according to so-called novelty criteria; in other words, RAN can dynamically manipulate the number of hidden layer units by evaluating the novelty criteria. However, the novelty criteria are sensitive to the initialized data, which can easily lengthen the training time of the network and reduce its effectiveness [22]. Meanwhile, the least mean-square (LMS) algorithm that RAN applies to update its learning parameters usually leaves the network with a low convergence rate [23], [24].

To tackle these problems, in this paper we propose a staged learning-based resource allocation network (SLRAN) which divides its learning process into a preliminary learning phase and a refined learning phase. In the former phase, to reduce the sensitivity to the initialized data, an agglomerative hierarchical k-means method is utilized to construct the structure of the hidden layer; subsequently, a novelty criterion is put forward to dynamically add or prune hidden layer centers, so that a compact structure is created. That is, the former phase reduces the complexity of the network and builds the initial structure of RAN. In the latter phase, a least squares method is used to raise the convergence rate of the network and further refine its learning ability, as sketched below. Therefore, SLRAN builds a compact structure which decreases the computational complexity of the network and boosts its learning ability for classification.
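To make the role of the least squares step concrete, the following minimal sketch (our illustration, not the authors' code; the function names and the Gaussian hidden units are assumptions based on the standard RBF formulation) shows how, once the hidden layer centers and widths are fixed by the preliminary phase, the output weights can be obtained in one closed-form solve instead of incremental LMS updates.

```python
import numpy as np

def gaussian_activations(X, centers, widths):
    """Hidden-layer outputs of an RBF-style network: exp(-||x - c||^2 / sigma^2)."""
    # X: (n_samples, n_features), centers: (n_hidden, n_features), widths: (n_hidden,)
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (widths ** 2))

def solve_output_weights(X, T, centers, widths):
    """Refined-phase sketch: fit output weights by batch least squares,
    given fixed centers and widths. T holds one-hot targets, one row per document."""
    H = gaussian_activations(X, centers, widths)      # (n_samples, n_hidden)
    H = np.hstack([H, np.ones((H.shape[0], 1))])      # append bias column
    W, *_ = np.linalg.lstsq(H, T, rcond=None)         # closed-form solve
    return W                                          # (n_hidden + 1, n_classes)
```

Solving for the weights in a single batch step is what avoids the slow convergence of the per-sample LMS rule; the exact update schedule used by SLRAN is described in Section 3.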

The rest of this paper is organized as follows. Section 2 introduces the basic concepts of RAN. Section 3 proposes the SLRAN algorithm as an efficient text classifier and describes its details. The steps to generate the latent semantic features of text documents, which help enhance text categorization performance, are described in Section 4. Experimental results and analysis are presented in Section 5. Conclusions are discussed in Section 6.

Section snippets

Resource allocation network (RAN)

RAN is a promising sequential learning algorithm based on the RBF neural network. The architecture of RAN includes three layers, i.e. an input layer, an output layer and a single hidden layer. The topology of RAN is shown in Fig. 1. During the training process of RAN, an n-dimensional input vector is given to the input layer, and based on the assigned input pattern, RAN computes an m-dimensional vector in the output layer. That is to say, the aim of the RAN network
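The snippet above is cut off before the mapping itself is stated. For reference, the standard RBF/RAN forward pass takes the following form (the notation here is ours, not copied from the paper): each of the K hidden units is a Gaussian with center c_k and width sigma_k, and each of the m outputs is a weighted sum of the hidden activations.

```latex
\phi_k(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{c}_k \rVert^2}{\sigma_k^2}\right),
\qquad
y_j(\mathbf{x}) = \sum_{k=1}^{K} w_{jk}\,\phi_k(\mathbf{x}) + b_j,
\quad j = 1,\dots,m.
```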

Staged learning-based resource allocation network (SLRAN)

To handle the above-mentioned problems of RAN, in this paper we propose a staged learning-based resource allocation network (SLRAN) which divides the learning process into two phases. In order to reduce the sensitivity to the initialized data, in the preliminary learning phase of SLRAN an agglomerative hierarchical k-means method is utilized to construct the structure of the hidden layer. Subsequently, a novelty criterion is put forward to dynamically add or prune hidden layer centers.
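As a concrete illustration of the add step in the preliminary phase, the sketch below applies a Platt-style novelty test: a training document is allocated as a new hidden center only when it is both far from every existing center and poorly predicted. The thresholds `dist_min` and `err_min` are placeholder names, and the pruning rule and the paper's exact parameter schedule are not reproduced here.

```python
import numpy as np

def novelty_add(x, target, centers, widths, predict_fn,
                dist_min=0.5, err_min=0.1):
    """Preliminary-phase sketch: allocate a new hidden unit at x if it passes
    both novelty tests; otherwise leave the hidden layer unchanged."""
    err = np.linalg.norm(target - predict_fn(x))                 # prediction error on x
    if centers.shape[0] == 0:
        dist = np.inf                                            # no centers yet
    else:
        dist = np.linalg.norm(centers - x, axis=1).min()         # nearest-center distance
    if dist > dist_min and err > err_min:                        # both criteria satisfied
        centers = np.vstack([centers, x])
        widths = np.append(widths, dist if np.isfinite(dist) else 1.0)
    return centers, widths
```

In SLRAN the initial centers fed into this loop come from the agglomerative hierarchical k-means step rather than from an empty hidden layer, which is what reduces the sensitivity to the first few training samples.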

Vector space model (VSM)

VSM is a commonly used method for representing documents through the weights of the words they contain. The model is based on the idea that the meaning of a document can be conveyed by its words, and the weight of each feature, which represents the contribution of every word, is evaluated by a statistical rule [25], [26], [27]. It is implemented by creating a term-document matrix that represents the whole dataset. In order to create a set of initial feature vectors used for representing the
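A minimal way to build such a term-document weight matrix is shown below using scikit-learn; TF-IDF is only one common choice for the statistical weighting rule, and neither the paper's exact scheme nor the latent-semantic reduction of Section 4 is reproduced in this sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy documents standing in for the Reuters/20-newsgroup articles.
docs = [
    "oil prices rose sharply on supply fears",
    "the central bank raised interest rates again",
]

# Rows are documents, columns are terms, entries are TF-IDF weights.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
term_doc_matrix = vectorizer.fit_transform(docs)   # sparse matrix, shape (n_docs, n_terms)

print(term_doc_matrix.shape)
print(vectorizer.get_feature_names_out()[:5])
```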

Data sets

In order to measure the effectiveness of our approach, we conduct experiments on two standard text corpora, i.e. the Reuters-21578 corpus and the 20-newsgroup corpus. From the former dataset, we choose 1500 documents covering ten categories, i.e. acq, coffee, crude, earn, grain, money-fx, trade, interest, ship and sugar. From the latter dataset, 1200 documents are selected from ten categories, i.e. Alt.atheism, Comp.windows.x, Sci.crypt, Rec.motorcycles,
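For reference, a comparable 20-newsgroup subset can be pulled with scikit-learn as sketched below. Only the four categories named before the excerpt is cut off are listed, and this is not necessarily how the authors sampled their 1200 documents.

```python
from sklearn.datasets import fetch_20newsgroups

# Four of the ten categories named in the excerpt (the rest are truncated);
# scikit-learn uses lowercase category names.
categories = ["alt.atheism", "comp.windows.x", "sci.crypt", "rec.motorcycles"]

newsgroups = fetch_20newsgroups(subset="all", categories=categories,
                                remove=("headers", "footers", "quotes"))
print(len(newsgroups.data), "documents across", len(newsgroups.target_names), "categories")
```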

Conclusion and discussion

In this paper, we propose a learning classifier which utilizes a staged learning-based resource allocation network (SLRAN) for text categorization. According to its learning progress, we divide SLRAN into a preliminary learning phase and a refined learning phase. In the former phase, an agglomerative hierarchical k-means method is utilized to create the initial structure of the hidden layer. Such a method reduces the sensitivity to the input data and can effectively prevent it from

Acknowledgments

The authors thank the editors and reviewers for providing very helpful comments and suggestions. Their insight and comments led to a better presentation of the ideas expressed in this paper. This work was sponsored by the National Natural Science Foundation of China (61103129), the fourth stage of the Brain Korea 21 Project, the Natural Science Foundation of Jiangsu Province (SBK201122266), the SRF for ROCS, SEM, and the Specialized Research Fund for the Doctoral Program of Higher Education (20100093120004

References (34)

  • W. Song et al., Fuzzy evolutionary optimization modeling and its applications to unsupervised categorization and extractive summarization, Expert Syst. Appl. (2011)
  • W. Song et al., Parametric and nonparametric evolutionary computing with a content-based feature selection approach for parallel categorization, Expert Syst. Appl. (2009)
  • B. Yu et al., Latent semantic analysis for text categorization using neural network, Knowl. Based Syst. (2008)
  • C.H. Li et al., An automatically constructed thesaurus for neural network based document categorization, Expert Syst. Appl. (2009)
  • W. Song et al., Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures, Expert Syst. Appl. (2009)
  • H.T. Lin et al., A note on Platt's probabilistic outputs for support vector machines, Mach. Learn. (2007)
  • E. Plaku et al., Distributed computation of the KNN graph for large high-dimensional point sets, J. Parallel Distrib. Comput. (2007)

Wei Song received his MS degree in Information and Communication Engineering from Chonbuk National University, Jeonbuk, Korea, in 2006. He received his PhD in Computer Science from Chonbuk National University in 2009. Upon graduation, he joined the School of Internet of Things (IoT) at Jiangnan University. His research interests include pattern recognition, information retrieval, evolutionary computing, neural networks, artificial intelligence, data mining and knowledge discovery.

Peng Chen received his B.Eng. degree in 2012 from the Hubei University of Technology, Wuhan, China, and he is currently an MS candidate in the School of Internet of Things (IoT) at Jiangnan University. His research interests include information retrieval, neural networks, data mining and knowledge discovery.

Soon Cheol Park received his BS degree in Applied Physics from Inha University, Incheon, Korea, in 1979. He received his PhD in Computer Science from Louisiana State University, Baton Rouge, Louisiana, USA, in 1991. He was a senior researcher in the Division of Computer Research at the Electronics and Telecommunications Research Institute, Korea, from 1991 to 1993. He is currently a Professor in the Department of Electronics and Information Engineering at Chonbuk National University, Jeonbuk, Korea. His research interests include pattern recognition, information retrieval, artificial intelligence, data mining and knowledge discovery.
