
Accelerating ELM training over data streams

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

In machine learning, offline training and online training are equally important, since both arise in many real applications. The extreme learning machine (ELM) offers fast learning speed and high accuracy for offline training, and the online sequential ELM (OS-ELM) is a variant of ELM that supports online training. With the explosive growth of data volumes, running these algorithms on distributed computing platforms has become inevitable, yet no efficient distributed framework currently supports both ELM and OS-ELM. Apache Flink is an open-source, stream-based distributed platform that handles both offline and online data processing with good scalability, high throughput, and fault tolerance, so it is well suited to accelerating both algorithms. In this paper, we first study the characteristics of ELM, OS-ELM, and distributed computing platforms, and then propose ELM-SDF, an efficient stream-based distributed framework for both ELM and OS-ELM, implemented on Flink. We evaluate its algorithms, FLELM and FLOS-ELM (the Flink-based implementations of ELM and OS-ELM, respectively), with synthetic data on a distributed cluster. The advantages of the proposed framework are as follows. (1) FLELM trains consistently faster than ELM on Hadoop and Spark, and scales better. (2) FLOS-ELM achieves better response time and throughput than OS-ELM on Hadoop and Spark when incremental training samples arrive. (3) FLOS-ELM's response time and throughput improve further in native stream-processing mode when incremental data samples arrive continuously.
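
For readers unfamiliar with the two learners named above, the sketch below illustrates what they compute: batch ELM solves for the output weights in closed form via the Moore-Penrose pseudoinverse, while OS-ELM folds newly arriving data chunks into the model with a recursive least-squares update. This is a minimal single-machine NumPy sketch, not the paper's implementation; the function names, sigmoid activation, and synthetic-stream loop are illustrative assumptions, and the distributed Flink execution of FLELM/FLOS-ELM is not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

def init_hidden(n_features, n_hidden):
    # ELM fixes the hidden layer at random and never retrains it
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    return W, b

def hidden_output(X, W, b):
    # H = g(XW + b), here with a sigmoid activation g
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_fit(X, T, W, b):
    # Batch ELM: beta = pinv(H) @ T, a single closed-form solve
    return np.linalg.pinv(hidden_output(X, W, b)) @ T

def oselm_init(X0, T0, W, b):
    # OS-ELM initialization phase on the first data block
    H0 = hidden_output(X0, W, b)
    P = np.linalg.inv(H0.T @ H0)  # needs at least n_hidden independent rows
    return P, P @ H0.T @ T0

def oselm_update(Xk, Tk, W, b, P, beta):
    # OS-ELM sequential phase: fold one arriving chunk into the model
    Hk = hidden_output(Xk, W, b)
    K = np.linalg.inv(np.eye(len(Xk)) + Hk @ P @ Hk.T)
    P = P - P @ Hk.T @ K @ Hk @ P
    beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
    return P, beta

# Simulate a stream: initialize on one block, then update per chunk.
W, b = init_hidden(n_features=10, n_hidden=50)
X0, T0 = rng.standard_normal((200, 10)), rng.standard_normal((200, 1))
P, beta = oselm_init(X0, T0, W, b)
for _ in range(5):
    Xk, Tk = rng.standard_normal((20, 10)), rng.standard_normal((20, 1))
    P, beta = oselm_update(Xk, Tk, W, b, P, beta)

The property visible in oselm_update is that each chunk is absorbed in time independent of how much data has already been seen, which is what makes OS-ELM amenable to the continuous, stream-based execution the paper builds on Flink.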

Acknowledgements

This research was partially funded by the National Key Research and Development Program of China (Grant No. 2016YFC1401900), the National Natural Science Foundation of China (Grant Nos. 61872072, 61572119, 61572121, 61622202, 61732003, 61729201, 61702086, and U1401256), the Fundamental Research Funds for the Central Universities (Grant Nos. N171604007 and N171904007), the Natural Science Foundation of Liaoning Province (Grant No. 20170520164), and the China Postdoctoral Science Foundation (Grant No. 2018M631806).

Author information

Corresponding author

Correspondence to Gang Wu.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Informed consent

Informed consent was obtained from all individual participants.

Human and Animal Rights

This article does not contain any studies involving human participants and/or animals by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ji, H., Wu, G. & Wang, G. Accelerating ELM training over data streams. Int. J. Mach. Learn. & Cyber. 12, 87–102 (2021). https://doi.org/10.1007/s13042-020-01158-8
