Abstract
Extreme learning machine (ELM) and its variants have been widely used in many fields owing to their good generalization performance and fast learning speed. Although distributed ELM can efficiently process large-scale labeled training data, existing techniques cannot handle partially labeled or unlabeled training data. We therefore propose U-DELM, a unified distributed ELM based on the MapReduce framework that supports supervised, semi-supervised and unsupervised learning, overcoming the inability of existing distributed ELM frameworks to process partially labeled and unlabeled training data. We first compare the computation formulas of the supervised, semi-supervised and unsupervised learning methods and find that most of the expensive computations are decomposable. We then propose the MapReduce-based U-DELM, which extracts three different continued matrix multiplications from these computational formulas and transforms each into a cumulative sum suitable for MapReduce. The three formulas are then combined to solve for the output weights under the three learning methods. Finally, we use benchmark and synthetic datasets to verify the efficiency and effectiveness of U-DELM in learning from massive data. The results show that U-DELM achieves unified distributed supervised, semi-supervised and unsupervised learning.
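The key observation in the abstract is that the expensive matrix products in the ELM output-weight formulas decompose into sums over row blocks of the training data, which is what makes them MapReduce-friendly. The sketch below is a minimal single-machine illustration of that idea for the supervised case only, assuming the standard regularized ELM solution β = (I/C + HᵀH)⁻¹HᵀT; the split count, variable names, and random data are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L, m, C = 200, 10, 3, 1.0      # samples, hidden nodes, output dims, regularizer
H = rng.standard_normal((n, L))    # hidden-layer output matrix
T = rng.standard_normal((n, m))    # training targets

# "Map" phase: each data split computes its local partial products
# H_i^T H_i and H_i^T T_i independently of the other splits.
splits = np.array_split(np.arange(n), 4)
partials = [(H[idx].T @ H[idx], H[idx].T @ T[idx]) for idx in splits]

# "Reduce" phase: the full products are just the sums of the partials.
HtH = sum(p[0] for p in partials)
HtT = sum(p[1] for p in partials)

# Output weights of the regularized supervised ELM.
beta = np.linalg.solve(np.eye(L) / C + HtH, HtT)

# Sanity check: the decomposed computation matches the direct one.
beta_direct = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)
assert np.allclose(beta, beta_direct)
```

Because only the small L×L and L×m partial products cross the map/reduce boundary (never the full n×L matrix H), communication cost is independent of the number of training samples; the semi-supervised and unsupervised formulas in the paper admit analogous block decompositions.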






Acknowledgements
This research was partially supported by the National Natural Science Foundation of China under Grant Nos. 61472069, 61402089 and U1401256; the Fundamental Research Funds for the Central Universities under Grant Nos. N161602003, N171607010, N161904001 and N160601001; and the Natural Science Foundation of Liaoning Province under Grant No. 2015020553.
Cite this article
Wang, Z., Qu, L., Xin, J. et al. A unified distributed ELM framework with supervised, semi-supervised and unsupervised big data learning. Memetic Comp. 11, 305–315 (2019). https://doi.org/10.1007/s12293-018-0271-8