Neurocomputing

Volume 149, Part B, 3 February 2015, Pages 1125-1134

Application of a staged learning-based resource allocation network to automatic text categorization

https://doi.org/10.1016/j.neucom.2014.07.017

Abstract

In this paper, we propose a novel learning classifier that utilizes a staged learning-based resource allocation network (SLRAN) for text categorization. According to its learning progress, SLRAN is divided into a preliminary learning phase and a refined learning phase. In the former phase, an agglomerative hierarchical k-means method is utilized to create the initial structure of the hidden layer, which reduces the sensitivity of the network to the input data; subsequently, a novelty criterion is put forward to dynamically regulate the hidden layer centers. In the latter phase, a least squares method is used to raise the convergence rate of the network and further improve its classification ability. This staged learning approach builds a compact structure that decreases the computational complexity of the network and boosts its learning capability. In order to apply SLRAN to text categorization, we utilize a semantic similarity approach that reduces the input scale of the neural network and reveals the latent semantics among text features. The benchmark Reuters and 20-newsgroup datasets are tested in our experiments, and the extensive results reveal that the dynamic learning process of SLRAN improves classification performance in comparison with conventional classifiers, e.g. RAN, BP and RBF neural networks, and SVM.

Introduction

With the rapid development of Internet technology, the volume of online documents and information is growing exponentially. The demand to find useful information rapidly and accurately in such large collections has become a challenge for modern information retrieval (IR) technologies. Text categorization (TC) is a crucial and well-proven instrument for organizing large volumes of textual information. As a key technique in the IR field, TC has been extensively researched in recent decades. Meanwhile, TC has become a hot spot and has given rise to a series of related applications, including web page classification, query recommendation, spam filtering, and topic spotting.

In recent years, an increasing number of approaches based on intelligent agents and machine learning, e.g. support vector machines (SVM) [1], decision trees [2], [3], K-nearest neighbor (KNN) [4], [5], Bayesian models [6], [7], [8], and neural networks [9], [10], have been applied to text categorization. Although such methods have been extensively researched, present automated text classifiers still make errors and their effectiveness needs improvement, so text categorization remains an important research field. Since the artificial neural network is still one of the most powerful tools in the field of pattern recognition [11], we employ it as our classifier.

As a basic supervised network, the back propagation (BP) neural network suffers from a slow training rate and a high tendency to become trapped in local minima. In contrast, the relatively simple mechanism of the radial basis function (RBF) neural network [12], [13], [14] avoids the slow learning rate and displays a robust global approximation property. It is known that the key to building a successful RBF neural network is to ensure a proper number of units in its hidden layer [15]. More specifically, too few hidden layer nodes impair its decision-making ability, whereas redundant hidden layer nodes incur a high computational cost [16], [17], [18]. That is to say, too small a network architecture may cause under-fitting, while too large an architecture may over-fit the data [19], [20]. Although many learning methods have been studied to regulate the hidden nodes so as to obtain a suitable structure for the RBF network, the most remarkable approach is the resource allocation network (RAN) learning method put forward by Platt [21]. Platt made a significant contribution by developing an algorithm that regulates the hidden nodes according to so-called novelty criteria; in other words, RAN can dynamically manipulate the number of hidden layer units by evaluating the novelty criteria. However, the novelty criteria are sensitive to the initialized data, which can easily lengthen the training time of the network and reduce its effectiveness [22]. Meanwhile, the least mean-square (LMS) algorithm that RAN applies to update its learning parameters usually leaves the network with a low convergence rate [23], [24].

To tackle these problems, in this paper we propose a staged learning-based resource allocation network (SLRAN) which divides its learning process into a preliminary learning phase and a refined learning phase. In the former phase, to reduce the sensitivity to the initialized data, an agglomerative hierarchical k-means method is utilized to construct the structure of the hidden layer; subsequently, a novelty criterion is put forward to dynamically add or prune hidden layer centers, so that a compact structure is created. That is, the former phase reduces the complexity of the network and builds the initial structure of RAN. In the latter phase, a least squares method is used to raise the convergence rate of the network and further refine its learning ability, as sketched below. Therefore, SLRAN builds a compact structure which decreases the computational complexity of the network and boosts its learning ability for classification.
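To make the role of the least squares step concrete, the following minimal sketch (our illustration, not the authors' code; the function names and the Gaussian hidden units are assumptions based on the standard RBF formulation) shows how, once the hidden layer centers and widths are fixed by the preliminary phase, the output weights can be obtained in one closed-form solve instead of incremental LMS updates.

```python
import numpy as np

def gaussian_activations(X, centers, widths):
    """Hidden-layer outputs of an RBF-style network: exp(-||x - c||^2 / sigma^2)."""
    # X: (n_samples, n_features), centers: (n_hidden, n_features), widths: (n_hidden,)
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (widths ** 2))

def solve_output_weights(X, T, centers, widths):
    """Refined-phase sketch: fit output weights by batch least squares,
    given fixed centers and widths. T holds one-hot targets, one row per document."""
    H = gaussian_activations(X, centers, widths)      # (n_samples, n_hidden)
    H = np.hstack([H, np.ones((H.shape[0], 1))])      # append bias column
    W, *_ = np.linalg.lstsq(H, T, rcond=None)         # closed-form solve
    return W                                          # (n_hidden + 1, n_classes)
```

Solving for the weights in a single batch step is what avoids the slow convergence of the per-sample LMS rule; the exact update schedule used by SLRAN is described in Section 3.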

The rest of this paper is organized as follows. Section 2 introduces the basic concepts of RAN. Section 3 proposes the SLRAN algorithm as an efficient text classifier and describes its details. The steps to generate the latent semantic features of text documents, which help enhance text categorization performance, are described in Section 4. Experimental results and analysis are presented in Section 5. Conclusions are discussed in Section 6.

Section snippets

Resource allocation network (RAN)

RAN is a promising sequential learning algorithm based on the RBF neural network. The architecture of RAN includes three layers, i.e. an input layer, an output layer and a single hidden layer. The topology of RAN is shown in Fig. 1. During the training process of RAN, an n-dimensional input vector is given to the input layer, and based on the assigned input pattern, RAN computes an m-dimensional vector in the output layer. That is to say, the aim of the RAN network
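The snippet above is cut off before the mapping itself is stated. For reference, the standard RBF/RAN forward pass takes the following form (the notation here is ours, not copied from the paper): each of the K hidden units is a Gaussian with center c_k and width sigma_k, and each of the m outputs is a weighted sum of the hidden activations.

```latex
\phi_k(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{c}_k \rVert^2}{\sigma_k^2}\right),
\qquad
y_j(\mathbf{x}) = \sum_{k=1}^{K} w_{jk}\,\phi_k(\mathbf{x}) + b_j,
\quad j = 1,\dots,m.
```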

Staged learning-based resource allocation network (SLRAN)

To handle the above-mentioned problems of RAN, in this paper we propose a staged learning-based resource allocation network (SLRAN) which divides the learning process into two phases. In order to reduce the sensitivity to the initialized data, in the preliminary learning phase of SLRAN an agglomerative hierarchical k-means method is utilized to construct the structure of the hidden layer. Subsequently, a novelty criterion is put forward to dynamically add or prune hidden layer centers.
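As a concrete illustration of the add step in the preliminary phase, the sketch below applies a Platt-style novelty test: a training document is allocated as a new hidden center only when it is both far from every existing center and poorly predicted. The thresholds `dist_min` and `err_min` are placeholder names, and the pruning rule and the paper's exact parameter schedule are not reproduced here.

```python
import numpy as np

def novelty_add(x, target, centers, widths, predict_fn,
                dist_min=0.5, err_min=0.1):
    """Preliminary-phase sketch: allocate a new hidden unit at x if it passes
    both novelty tests; otherwise leave the hidden layer unchanged."""
    err = np.linalg.norm(target - predict_fn(x))                 # prediction error on x
    if centers.shape[0] == 0:
        dist = np.inf                                            # no centers yet
    else:
        dist = np.linalg.norm(centers - x, axis=1).min()         # nearest-center distance
    if dist > dist_min and err > err_min:                        # both criteria satisfied
        centers = np.vstack([centers, x])
        widths = np.append(widths, dist if np.isfinite(dist) else 1.0)
    return centers, widths
```

In SLRAN the initial centers fed into this loop come from the agglomerative hierarchical k-means step rather than from an empty hidden layer, which is what reduces the sensitivity to the first few training samples.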

Vector space model (VSM)

VSM is a commonly used method for representing documents through the weights of the words they contain. The model is based on the idea that the meaning of a document can be conveyed by its words, and the weight of each feature, which represents the contribution of every word, is evaluated by a statistical rule [25], [26], [27]. It is implemented by creating a term-document matrix that represents the whole dataset. In order to create a set of initial feature vectors used for representing the
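A minimal way to build such a term-document weight matrix is shown below using scikit-learn; TF-IDF is only one common choice for the statistical weighting rule, and neither the paper's exact scheme nor the latent-semantic reduction of Section 4 is reproduced in this sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy documents standing in for the Reuters/20-newsgroup articles.
docs = [
    "oil prices rose sharply on supply fears",
    "the central bank raised interest rates again",
]

# Rows are documents, columns are terms, entries are TF-IDF weights.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
term_doc_matrix = vectorizer.fit_transform(docs)   # sparse matrix, shape (n_docs, n_terms)

print(term_doc_matrix.shape)
print(vectorizer.get_feature_names_out()[:5])
```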

Data sets

In order to measure the effectiveness of our approach, we conduct experiments on two standard text corpora, i.e. the Reuters-21578 corpus and the 20-newsgroup corpus. From the former dataset, we choose 1500 documents covering ten categories, i.e. acq, coffee, crude, earn, grain, money-fx, trade, interest, ship and sugar. From the latter dataset, 1200 documents are selected from ten categories, i.e. Alt.atheism, Comp.windows.x, Sci.crypt, Rec.motorcycles,
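For reference, a comparable 20-newsgroup subset can be pulled with scikit-learn as sketched below. Only the four categories named before the excerpt is cut off are listed, and this is not necessarily how the authors sampled their 1200 documents.

```python
from sklearn.datasets import fetch_20newsgroups

# Four of the ten categories named in the excerpt (the rest are truncated);
# scikit-learn uses lowercase category names.
categories = ["alt.atheism", "comp.windows.x", "sci.crypt", "rec.motorcycles"]

newsgroups = fetch_20newsgroups(subset="all", categories=categories,
                                remove=("headers", "footers", "quotes"))
print(len(newsgroups.data), "documents across", len(newsgroups.target_names), "categories")
```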

Conclusion and discussion

In this paper, we propose a learning classifier which utilizes a staged learning-based resource allocation network (SLRAN) for text categorization. According to its learning progress, we divide SLRAN into a preliminary learning phase and a refined learning phase. In the former phase, an agglomerative hierarchical k-means method is utilized to create the initial structure of the hidden layer. Such a method reduces the sensitivity to the input data and can effectively prevent it from

Acknowledgments

The authors thank the editors and reviewers for providing very helpful comments and suggestions. Their insight and comments led to a better presentation of the ideas expressed in this paper. This work was sponsored by the National Natural Science Foundation of China (61103129), the fourth stage of the Brain Korea 21 Project, the Natural Science Foundation of Jiangsu Province (SBK201122266), the SRF for ROCS, SEM, and the Specialized Research Fund for the Doctoral Program of Higher Education (20100093120004

References (34)

  • W. Song et al., Fuzzy evolutionary optimization modeling and its applications to unsupervised categorization and extractive summarization, Expert Syst. Appl. (2011)
  • W. Song et al., Parametric and nonparametric evolutionary computing with a content-based feature selection approach for parallel categorization, Expert Syst. Appl. (2009)
  • B. Yu et al., Latent semantic analysis for text categorization using neural network, Knowl. Based Syst. (2008)
  • C.H. Li et al., An automatically constructed thesaurus for neural network based document categorization, Expert Syst. Appl. (2009)
  • W. Song et al., Genetic algorithm for text clustering using ontology and evaluating the validity of various semantic similarity measures, Expert Syst. Appl. (2009)
  • H.T. Lin et al., A note on Platt's probabilistic outputs for support vector machines, Mach. Learn. (2007)
  • E. Plaku et al., Distributed computation of the KNN graph for large high-dimensional point sets, J. Parallel Distrib. Comput. (2007)

Wei Song received his MS degree in Information and Communication Engineering from Chonbuk National University, Jeonbuk, Korea, in 2006. He received his PhD in Computer Science from Chonbuk National University in 2009. Upon graduation, he joined the School of Internet of Things (IoT) at Jiangnan University. His research interests include pattern recognition, information retrieval, evolutionary computing, neural networks, artificial intelligence, data mining and knowledge discovery.

Peng Chen received his B.Eng. degree in 2012 from the Hubei University of Technology, Wuhan, China, and he is currently an MS candidate in the School of Internet of Things (IoT) at Jiangnan University. His research interests include information retrieval, neural networks, data mining and knowledge discovery.

Soon Cheol Park received his BS degree in Applied Physics from Inha University, Incheon, Korea, in 1979. He received his PhD in Computer Science from Louisiana State University, Baton Rouge, Louisiana, USA, in 1991. He was a senior researcher in the Division of Computer Research at the Electronics and Telecommunications Research Institute, Korea, from 1991 to 1993. He is currently a Professor in the Department of Electronics and Information Engineering at Chonbuk National University, Jeonbuk, Korea. His research interests include pattern recognition, information retrieval, artificial intelligence, data mining and knowledge discovery.
