Study and Understanding the Significance of Multilayer-ELM Feature Space

  • Conference paper
Big Data Analytics (BDA 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12581)

Abstract

Multi-layer Extreme Learning Machine (Multi-layer ELM) is a popular deep learning classifier, valued for its ability to handle large volumes of data, its training without backpropagation, its fast learning speed, and its high level of data abstraction. Another distinctive feature of Multi-layer ELM is that it can make the input features linearly separable by mapping them non-linearly to an extended feature space. The architecture performs acceptably compared with other deep networks. This paper studies the high-dimensional feature space of Multi-layer ELM, named MLELM-HDFS, in detail by applying conventional unsupervised and semi-supervised clustering techniques to text data in that space and comparing the results with those obtained in the traditional TF-IDF vector space, named TFIDF-VS. Results for both unsupervised and semi-supervised clustering show that MLELM-HDFS is more promising than TFIDF-VS.
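The idea of a non-linear mapping to an extended feature space can be illustrated with a small sketch. The code below is not the paper's Multi-layer ELM; it is a simplified, single-layer ELM-style random feature map, written under the assumption that a random sigmoid projection is an acceptable stand-in for illustration. The names `elm_feature_map` and `linear_fit_accuracy`, the hidden size of 50, and the XOR toy data are all hypothetical choices made here, not taken from the paper.

```python
import numpy as np


def elm_feature_map(X, n_hidden, rng):
    """Map X (n_samples, n_features) into a random sigmoid feature space.

    Assumption: weights and biases are drawn once at random and never
    trained, as in a basic single-layer ELM hidden layer.
    """
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def linear_fit_accuracy(F, y):
    """Fit a least-squares linear separator on features F and score it."""
    F1 = np.hstack([F, np.ones((F.shape[0], 1))])  # append a bias column
    beta = np.linalg.pinv(F1) @ y  # Moore-Penrose pseudo-inverse solution
    return float(np.mean(np.sign(F1 @ beta) == y))


# XOR: a classic toy problem that is not linearly separable in input space.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0, -1.0])

rng = np.random.default_rng(0)
H = elm_feature_map(X, n_hidden=50, rng=rng)  # extended feature space

acc_input = linear_fit_accuracy(X, y)
acc_hidden = linear_fit_accuracy(H, y)
print("linear accuracy in input space:   ", acc_input)
print("linear accuracy in extended space:", acc_hidden)
```

In this toy setting a linear separator cannot classify all four XOR points in the original two-dimensional space, whereas the same linear fit succeeds on the 50-dimensional mapped features. Multi-layer ELM as described in the literature stacks several such layers; that stacking is deliberately left out of this sketch.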

Notes

  1. https://pythonprogramming.net/lemmatizing-nltk-tutorial/

  2. https://www.nltk.org/

  3. https://github.com/alvations/pywsd

  4. https://www.nltk.org/

  5. http://www.dataminingresearch.com/index.php/2010/09/classic3-classic4-datasets/

  6. http://qwone.com/~jason/20Newsgroups/

  7. http://www.daviddlewis.com/resources/testcollections/reuters21578/

Author information

Corresponding author

Correspondence to Rajendra Kumar Roul.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Roul, R.K. (2020). Study and Understanding the Significance of Multilayer-ELM Feature Space. In: Bellatreche, L., Goyal, V., Fujita, H., Mondal, A., Reddy, P.K. (eds) Big Data Analytics. BDA 2020. Lecture Notes in Computer Science, vol 12581. Springer, Cham. https://doi.org/10.1007/978-3-030-66665-1_3

  • DOI: https://doi.org/10.1007/978-3-030-66665-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-66664-4

  • Online ISBN: 978-3-030-66665-1

  • eBook Packages: Computer Science, Computer Science (R0)