
Impact of multilayer ELM feature mapping technique on supervised and semi-supervised learning algorithms

  • Data analytics and machine learning
  • Published in Soft Computing

Abstract

Deep learning architectures are growing in popularity day by day, but most deep learning algorithms suffer from limitations such as slow convergence, long training times, sensitivity to noisy data, and the problem of local minima. Multilayer ELM, a deep learning architecture, overcomes these limitations to a large extent: it requires no backpropagation, which saves considerable training time and eliminates parameter fine-tuning, it attains a global optimum, and it can handle large volumes of data. Its most important characteristic, however, is its feature space, in which the input features can be made linearly separable without any kernel technique, a property that has so far received little attention from the research community. This paper studies the feature space of Multilayer ELM in light of its architecture and feature mapping technique. To assess the effectiveness of that feature mapping, supervised and semi-supervised learning algorithms are tested extensively both on the Multilayer ELM feature space and on the TF-IDF vector space. The experimental results show that the Multilayer ELM feature space is more effective than the TF-IDF vector space and that Multilayer ELM outperforms baseline machine learning and deep learning architectures.
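To make the feature mapping concrete, the following is a minimal sketch, in Python with NumPy, of the ELM autoencoder stacking that underlies the Multilayer ELM feature space. It is an illustration under stated assumptions, not the paper's implementation: the function names, the layer sizes, the tanh activation, and the ridge term reg are all illustrative choices. Each layer draws random input weights that are never trained, computes hidden activations, and solves for the output weights beta in closed form with a regularized pseudoinverse, so there is no backpropagation; beta transposed then maps the data into the next feature space.

```python
import numpy as np

def elm_autoencoder(X, n_hidden, reg=1e-3, seed=0):
    """One ELM autoencoder layer (illustrative sketch).

    Random input weights are drawn and left untrained; the output
    weights beta are obtained in closed form via a ridge-regularized
    pseudoinverse, so no backpropagation is needed.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden))   # random input weights
    b = rng.standard_normal(n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                            # hidden activations
    # Solve (H^T H + reg*I) beta = H^T X, i.e. H @ beta ~ X in least squares.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    # beta^T maps the input into the learned feature space.
    return np.tanh(X @ beta.T)

def multilayer_elm_features(X, layer_sizes=(256, 256, 512), seed=0):
    """Stack ELM autoencoders; the final output is the ML-ELM feature space."""
    H = np.asarray(X, dtype=float)
    for i, size in enumerate(layer_sizes):
        H = elm_autoencoder(H, size, seed=seed + i)
    return H

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((100, 50))   # stand-in for TF-IDF document vectors
    Z = multilayer_elm_features(X)
    print(X.shape, "->", Z.shape)        # (100, 50) -> (100, 512)
```

Under this reading, the paper's comparison amounts to training the same supervised or semi-supervised learner once on the raw TF-IDF matrix and once on multilayer_elm_features applied to it; because the mapped features tend toward linear separability, a plain linear classifier can be used in the ML-ELM space without any kernel trick.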


Notes

  1. https://tartarus.org/martin/PorterStemmer/

  2. https://pythonprogramming.net/combine-classifier-algorithms-nltk-tutorial/

  3. http://en.wikipedia.org/wiki/Brown_Corpus

  4. https://github.com/alvations/pywsd

  5. http://www.dataminingresearch.com/index.php/2010/09/classic3-classic4-datasets/

  6. https://data.mendeley.com/datasets/9mpgz8z257/1

  7. http://www.daviddlewis.com/resources/testcollections/reuters21578/

  8. http://qwone.com/~jason/20Newsgroups/


Author information


Corresponding author

Correspondence to Rajendra Kumar Roul.

Ethics declarations

Conflict of interest

The author declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by the author.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Roul, R.K. Impact of multilayer ELM feature mapping technique on supervised and semi-supervised learning algorithms. Soft Comput 26, 423–437 (2022). https://doi.org/10.1007/s00500-021-06387-9

