Abstract
We have previously proposed a distributed platform for machine learning that does not require data accumulation. The platform constructs feature models from distributed data and combines them to obtain the same level of performance as conventional methods based on data accumulation. In this paper, we propose and compare three methods for selecting and combining these combinable feature models, called fog models: sequential model selection, a conventional method that selects fog models in order of node ID; adaptive selection, which selects only models that improve task performance; and similar model selection, which selects models similar to the current model. Each method prioritizes different criteria when selecting models. We evaluate the proposed methods by simulation against the method from our previous research. The proposed method that prioritizes performance improvement improves performance on the processing tasks with a smaller number of combinations than the previous method. The proposed methods thus enable more efficient combination of fog models, using fewer models than in previous research, and users can adaptively select fog models either to improve overall performance or, by choosing the target models, to prioritize the performance of specific features.
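The three selection strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The `FogModel` class, the element-wise mean used in `combine`, the cosine similarity measure, the cutoff `k`, and the caller-supplied `score` function are all hypothetical stand-ins chosen for illustration, since the abstract does not specify how models are represented, combined, or evaluated.

```python
import math
from dataclasses import dataclass


@dataclass
class FogModel:
    """Hypothetical stand-in for a fog model held by one node."""
    node_id: int
    vector: list  # toy parameter vector; the real model format is not given


def combine(a, b):
    # Assumed combination rule: element-wise mean of two parameter vectors.
    return [(x + y) / 2 for x, y in zip(a, b)]


def cosine(a, b):
    # Assumed similarity measure: cosine similarity (0.0 for a zero vector).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sequential_selection(models):
    # Select every fog model, in order of node ID.
    return sorted(models, key=lambda m: m.node_id)


def adaptive_selection(current, models, score):
    # Visit models in node-ID order and keep one only if combining it
    # with the current model raises the caller-supplied task score.
    selected = []
    for m in sequential_selection(models):
        candidate = combine(current, m.vector)
        if score(candidate) > score(current):
            current = candidate
            selected.append(m)
    return selected, current


def similar_selection(current, models, k):
    # Rank models by similarity to the current model and keep the top k.
    ranked = sorted(models, key=lambda m: cosine(current, m.vector), reverse=True)
    return ranked[:k]
```

In this sketch, `adaptive_selection` can skip models that `sequential_selection` would always include, which is one way fewer combinations could reach a given task score. Whether that holds in practice depends on the real combination rule and evaluation task.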





Acknowledgements
This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (C), 2021–2023 (21K11850, Takeshi TSUCHIYA).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
This article is part of the topical collection “Future Data and Security Engineering 2022” guest edited by Tran Khanh Dang.
Cite this article
Tsuchiya, T., Mochizuki, R., Hirose, H. et al. Research on Selective Combination of Distributed Machine Learning Models. SN COMPUT. SCI. 3, 438 (2022). https://doi.org/10.1007/s42979-022-01312-9
DOI: https://doi.org/10.1007/s42979-022-01312-9