Abstract
Deep learning architectures are growing in popularity day by day, but most deep learning algorithms suffer from limitations such as slow convergence, long training times, sensitivity to noisy data, and the problem of local minima. Multilayer ELM, a deep learning architecture that uses no backpropagation, overcomes these limitations to a large extent: it saves considerable training time, eliminates the need to fine-tune parameters, ensures a global optimum, and can handle large volumes of data. Beyond these advantages, the most important feature of Multilayer ELM is the characteristic of its feature space, in which the input features can become linearly separable without any kernel technique; this property has so far received little attention from the research community. This paper studies the feature space of Multilayer ELM by considering its architecture and its feature mapping technique. To justify the effectiveness of this feature mapping, semi-supervised and supervised learning algorithms are tested extensively on the Multilayer ELM feature space and on the TF-IDF vector space. Experimental results show that the Multilayer ELM feature space is more effective than the TF-IDF vector space, and that Multilayer ELM outperforms the baseline machine learning and deep learning architectures.
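As a rough illustration of the training scheme the abstract alludes to, the sketch below implements a single-hidden-layer ELM in NumPy (the function names and the XOR toy data are illustrative, not taken from the paper). The hidden-layer weights are drawn at random and frozen, and the output weights are obtained in one closed-form step via the Moore-Penrose pseudoinverse, so no backpropagation or iterative fine-tuning is involved; Multilayer ELM stacks such randomly mapped layers.

```python
import numpy as np

def elm_fit(X, Y, n_hidden, rng):
    """Train a basic ELM: random frozen hidden layer, closed-form output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input-to-hidden weights
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid feature mapping
    beta = np.linalg.pinv(H) @ Y                  # Moore-Penrose pseudoinverse solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the same frozen random mapping, then the learned output weights."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy usage: XOR labels are not linearly separable in the input space,
# but become separable in the random high-dimensional feature space.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[1., 0.], [0., 1.], [0., 1.], [1., 0.]])  # one-hot XOR labels
W, b, beta = elm_fit(X, Y, n_hidden=50, rng=rng)
pred = elm_predict(X, W, b, beta).argmax(axis=1)         # class indices 0/1
```

Because the hidden layer is never trained, the only learned parameters are `beta`, which is why ELM training reduces to a single least-squares solve rather than gradient descent.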
























Ethics declarations
Conflict of interest
The author has declared that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by the author.
Cite this article
Roul, R.K. Impact of multilayer ELM feature mapping technique on supervised and semi-supervised learning algorithms. Soft Comput 26, 423–437 (2022). https://doi.org/10.1007/s00500-021-06387-9