Study the Significance of ML-ELM Using Combined PageRank and Content-Based Feature Selection

  • Conference paper
Distributed Computing and Internet Technology (ICDCIT 2021)

Abstract

Scalable big data analysis frameworks are of paramount importance in the modern web society, which is characterized by a huge number of resources, including electronic text documents. Hence, choosing an adequate subset of features that provides a complete representation of a document while discarding the irrelevant ones is of utmost importance. To this end, this paper studies the suitability and importance of a deep learning classifier called Multilayer ELM (ML-ELM) by proposing a combined PageRank and content-based feature selection (CPRCFS) technique applied to all the terms present in a given corpus. The top \(k\%\) of terms are selected to generate a reduced feature vector, which is then used to train different classifiers, including ML-ELM. Experimental results show that the proposed feature selection technique is better than or comparable to the baseline techniques, and that Multilayer ELM can outperform state-of-the-art machine learning and deep learning classifiers.
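The abstract does not spell out how the PageRank and content-based scores are computed or combined, so the following is only a minimal sketch of the general idea of ranking terms and keeping the top \(k\%\), not the authors' CPRCFS algorithm. It assumes a term co-occurrence graph as the input to PageRank, a smoothed TF-IDF sum as the content score, and a weighted sum with a hypothetical mixing weight `alpha`; the function names `pagerank` and `select_top_k_percent` are likewise illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank on an undirected weighted graph {node: {nbr: weight}}."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    # Weighted degree of each node; isolated nodes have degree 0 and pass on no rank.
    degree = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iters):
        new_rank = {}
        for v in nodes:
            incoming = sum(rank[u] * w / degree[u] for u, w in graph[v].items())
            new_rank[v] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def select_top_k_percent(docs, k=50.0, alpha=0.5):
    """Return the top k% of terms by a combined graph and content score.

    Assumptions (not taken from the paper): whitespace tokenization,
    a document-level co-occurrence graph, a smoothed TF-IDF content score,
    and a linear combination weighted by `alpha`.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    if not vocab:
        return []

    # Edge weight = number of documents in which two terms co-occur.
    graph = {term: Counter() for term in vocab}
    for doc in tokenized:
        for a, b in combinations(sorted(set(doc)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    graph_score = pagerank(graph)

    # Content score = sum over documents of tf * smoothed idf.
    n_docs = len(tokenized)
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    content_score = Counter()
    for doc in tokenized:
        for term, tf in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            content_score[term] += tf * idf

    # Scale both scores to [0, 1] before mixing them.
    g_max = max(graph_score.values())
    c_max = max(content_score.values())
    combined = {
        term: alpha * graph_score[term] / g_max
        + (1.0 - alpha) * content_score[term] / c_max
        for term in vocab
    }

    keep = max(1, math.ceil(len(vocab) * k / 100.0))
    ranked = sorted(vocab, key=lambda term: (-combined[term], term))
    return ranked[:keep]


corpus = ["apple banana cherry", "apple banana", "apple date"]
selected = select_top_k_percent(corpus, k=50)
```

The selected terms would then index the columns of the reduced feature vector passed to a classifier. Breaking ties alphabetically keeps the selection deterministic, which makes runs with different values of `k` easier to compare.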

Author information

Corresponding author

Correspondence to Rajendra Kumar Roul.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Roul, R.K., Sahoo, J.K. (2021). Study the Significance of ML-ELM Using Combined PageRank and Content-Based Feature Selection. In: Goswami, D., Hoang, T.A. (eds) Distributed Computing and Internet Technology. ICDCIT 2021. Lecture Notes in Computer Science, vol 12582. Springer, Cham. https://doi.org/10.1007/978-3-030-65621-8_20

  • DOI: https://doi.org/10.1007/978-3-030-65621-8_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65620-1

  • Online ISBN: 978-3-030-65621-8

  • eBook Packages: Computer Science, Computer Science (R0)
