Text categorization based on regularization extreme learning machine

  • Extreme Learning Machine’s Theory & Application
  • Neural Computing and Applications

Abstract

This article proposes a novel approach to text categorization based on a regularization extreme learning machine (RELM), whose weights can be obtained analytically and in which a bias-variance trade-off is achieved by adding a regularization term to the linear system of a single-hidden-layer feedforward neural network. To fit the input scale of RELM, latent semantic analysis is used to represent text in a reduced number of dimensions. A classification algorithm based on RELM is then developed for both the uni-label case (a document is assigned to exactly one category) and the multi-label case (a document can be assigned to several categories simultaneously). Experimental results on two benchmarks show that the proposed method performs well in most cases and learns faster than popular methods such as feedforward neural networks and support vector machines.
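
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a document-by-term count matrix, a truncated SVD as the latent semantic analysis step, sigmoid hidden units with uniformly random weights, a ridge (squared-norm) regularization term with a hypothetical strength `lam`, and a score threshold of 0 for the multi-label case. All function names and the toy data are illustrative only.

```python
import numpy as np


def lsa_fit(X, k):
    """Return a projection onto the top-k latent dimensions of X (docs x terms)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T  # shape (n_terms, k)


def hidden_layer(Z, W, b):
    """Sigmoid activations of the randomly initialised hidden layer."""
    return 1.0 / (1.0 + np.exp(-(Z @ W + b)))


def relm_fit(Z, T, n_hidden=50, lam=1e-2, seed=0):
    """Fit output weights analytically; hidden weights stay random (assumed setup)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(Z.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_layer(Z, W, b)
    # Regularized linear system: (H^T H + lam * I) beta = H^T T
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def relm_scores(Z, model):
    """Real-valued score per category for each document."""
    W, b, beta = model
    return hidden_layer(Z, W, b) @ beta


def predict_uni_label(scores):
    """Uni-label: pick the single highest-scoring category."""
    return scores.argmax(axis=1)


def predict_multi_label(scores, threshold=0.0):
    """Multi-label: keep every category whose score exceeds the threshold (assumed rule)."""
    return scores > threshold


# Toy corpus: category 0 uses terms 0-2, category 1 uses terms 3-5.
X_train = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])
T_train = 2.0 * np.eye(2)[y_train] - 1.0  # +1 for the true category, -1 otherwise

P = lsa_fit(X_train, k=2)
model = relm_fit(X_train @ P, T_train)

X_new = np.array([[3, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 3]], dtype=float)
pred_new = predict_uni_label(relm_scores(X_new @ P, model))
print(pred_new)
```

Training reduces to one linear solve because only the output weights are learned, which is consistent with the speed advantage the abstract reports. The same score matrix serves both settings, and only the decision rule changes.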

Notes

  1. http://www.daviddlewis.com/resources/.

  2. http://tartarus.org/~martin/PorterStemmer/.

  3. http://web.ist.utl.pt/~acardoso/datasets/.

Acknowledgments

The authors are grateful to the anonymous reviewers for their constructive suggestions. This work was supported by the 973 Program (Grant No. 2012CB316400), the National Natural Science Foundation of China (Grant No. 61171151), and the Natural Science Foundation of Zhejiang Province (Grant Nos. Y6110147 and Y1110342).

Author information

Corresponding author

Correspondence to Yuntao Qian.

About this article

Cite this article

Zheng, W., Qian, Y. & Lu, H. Text categorization based on regularization extreme learning machine. Neural Comput & Applic 22, 447–456 (2013). https://doi.org/10.1007/s00521-011-0808-y

  • DOI: https://doi.org/10.1007/s00521-011-0808-y
