Abstract
This article proposes a novel approach to text categorization based on a regularization extreme learning machine (RELM), whose weights can be obtained analytically and in which a bias-variance trade-off is achieved by adding a regularization term to the linear system of a single-hidden-layer feedforward neural network. To fit the input scale of RELM, latent semantic analysis is used to represent text in a reduced-dimensional space. Moreover, a classification algorithm based on RELM is developed for both the uni-label situation (i.e., a document can be assigned to only one category) and the multi-label situation (i.e., a document can be assigned to multiple categories simultaneously). Experimental results on two benchmarks show that the proposed method performs well in most cases and learns faster than popular methods such as feedforward neural networks or support vector machines.
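The pipeline summarized above (latent semantic analysis for dimensionality reduction, followed by a regularized extreme learning machine with analytically computed weights) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the ridge-style solution beta = (H^T H + lam I)^-1 H^T T, the sigmoid activation, the uniform random initialization, the documents-as-rows layout, and the function names lsa_project, hidden, relm_fit, and relm_predict are illustrative and may differ from the authors' formulation. NumPy is assumed to be available.

```python
import numpy as np

def lsa_project(X, k):
    """Project a document-term matrix X (docs x terms) onto its top-k
    right singular vectors, a common way to realize LSA (assumed layout)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Vk = Vt[:k].T                      # terms x k basis of the latent space
    return X @ Vk, Vk                  # docs x k representation, and the basis

def hidden(X, W, b):
    """Sigmoid hidden-layer output H for inputs X (assumed activation)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def relm_fit(X, T, n_hidden, lam, rng):
    """Fit a ridge-regularized ELM: random hidden layer, analytic output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden(X, W, b)
    # Regularized linear system: (H^T H + lam * I) beta = H^T T
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def relm_predict(X, model):
    """Uni-label decision: pick the category with the largest output score."""
    W, b, beta = model
    return np.argmax(hidden(X, W, b) @ beta, axis=1)

# Toy document-term counts (hypothetical data): two topics with disjoint vocabularies.
X = np.array([[2, 1, 1, 0, 0, 0],
              [1, 2, 0, 0, 0, 0],
              [1, 1, 2, 0, 0, 0],
              [0, 0, 0, 2, 1, 1],
              [0, 0, 0, 1, 2, 0],
              [0, 0, 0, 1, 1, 2]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
T = np.eye(2)[labels]                  # one-hot targets, one column per category

Z, Vk = lsa_project(X, k=2)            # reduce 6 terms to 2 latent dimensions
model = relm_fit(Z, T, n_hidden=10, lam=0.1, rng=np.random.default_rng(0))
print(relm_predict(Z, model))
```

In this sketch, a larger lam shrinks the output weights (more bias, less variance), which is one way to read the trade-off the abstract describes. For the multi-label situation, one plausible variant is to threshold each column of the score matrix instead of taking the argmax; that rule is likewise an assumption and is not stated in the abstract.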
Acknowledgments
The authors are grateful to anonymous reviewers for their constructive suggestions. This work was supported by the 973 Program (Grant No. 2012CB316400), the National Natural Science Foundation of China (Grant No. 61171151), and the Natural Science Foundation of Zhejiang Province (Grant No. Y6110147 and Y1110342).
About this article
Cite this article
Zheng, W., Qian, Y. & Lu, H. Text categorization based on regularization extreme learning machine. Neural Comput & Applic 22, 447–456 (2013). https://doi.org/10.1007/s00521-011-0808-y
DOI: https://doi.org/10.1007/s00521-011-0808-y