Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario

Fengli, Zhou; Xiaoli, Lin

doi:10.1007/978-3-030-26766-7_46

Zhou Fengli & Lin Xiaoli

Conference paper
First Online: 24 July 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11645)

Abstract

Each MapReduce job independently performs a series of complex operations such as task scheduling and resource allocation. As a result, when a single algorithm is implemented as several cooperating MapReduce jobs, those jobs incur a large amount of redundant disk I/O and repeated resource requests, which leads to inefficient resource utilization during computation. Big data mining algorithms are usually divided into several MapReduce jobs. Taking the ItemBased algorithm as an example, this paper analyzes the resource efficiency of mining algorithms in the multi-MapReduce-job collaboration scenario. It proposes an ItemBased algorithm that uses DistributedCache to cache the I/O data passed between multiple MapReduce jobs, which overcomes the drawback of jobs running independently of one another and reduces the waiting delay between Map and Reduce tasks. The experimental results show that DistributedCache can improve the data reading speed of MapReduce jobs, and that the algorithm restructured around DistributedCache greatly reduces the waiting delay between Map and Reduce tasks and improves resource efficiency by more than three times.
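
The idea of sharing intermediate data between chained jobs can be illustrated with a toy, single-process sketch. It is not the authors' implementation and does not use Hadoop or its DistributedCache API: the ratings, the function names, and the plain dictionary standing in for a cache are all hypothetical. It computes item co-occurrence counts (a common first step in item-based recommendation) as two map/reduce-style stages, once by writing the intermediate output to disk and reading it back, and once by keeping it in a shared in-memory cache.

```python
import json
import os
import tempfile
from collections import defaultdict
from itertools import combinations

# Hypothetical toy ratings as (user, item) pairs; not data from the paper.
RATINGS = [("u1", "a"), ("u1", "b"), ("u1", "c"),
           ("u2", "a"), ("u2", "b"),
           ("u3", "b"), ("u3", "c")]


def job1_group_items_by_user(ratings):
    """Stage 1: reduce (user, item) records to user -> sorted item list."""
    grouped = defaultdict(list)
    for user, item in ratings:
        grouped[user].append(item)
    return {user: sorted(items) for user, items in grouped.items()}


def job2_cooccurrence(user_items):
    """Stage 2: emit item pairs per user, reduce to pair -> count."""
    counts = defaultdict(int)
    for items in user_items.values():
        for pair in combinations(items, 2):
            counts[pair] += 1
    return dict(counts)


def run_via_disk(ratings, workdir):
    """Independent jobs: stage 1 output is written to disk, then re-read."""
    path = os.path.join(workdir, "job1_output.json")
    with open(path, "w") as f:
        json.dump(job1_group_items_by_user(ratings), f)
    with open(path) as f:
        user_items = json.load(f)
    return job2_cooccurrence(user_items)


def run_via_cache(ratings, cache):
    """Collaborating jobs: stage 1 output stays in a shared in-memory cache."""
    cache["job1_output"] = job1_group_items_by_user(ratings)
    return job2_cooccurrence(cache["job1_output"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        disk_result = run_via_disk(RATINGS, tmp)
    cache_result = run_via_cache(RATINGS, {})
    print(disk_result == cache_result)
    print(cache_result)
```

Both paths produce the same counts; the only difference is whether the second stage pays for a disk round trip. In a real cluster the saving would also depend on scheduling and network costs that this sketch does not model.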

Acknowledgment

This work was supported in part by the Research Project of the Hubei Provincial Department of Education (No. B2017590).

Author information

Authors and Affiliations

Faculty of Information Technology, Wuhan College of Foreign Language and Foreign Affairs, Wuhan, 430083, China
Zhou Fengli & Lin Xiaoli

Authors

Zhou Fengli
Lin Xiaoli

Corresponding author

Correspondence to Zhou Fengli.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
Nanchang Institute of Technology, Nanchang, China
Zhi-Kai Huang
Liverpool John Moores University, Liverpool, UK
Abir Hussain

About this paper

Cite this paper

Fengli, Z., Xiaoli, L. (2019). Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario. In: Huang, DS., Huang, ZK., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2019. Lecture Notes in Computer Science, vol 11645. Springer, Cham. https://doi.org/10.1007/978-3-030-26766-7_46

DOI: https://doi.org/10.1007/978-3-030-26766-7_46
Published: 24 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26765-0
Online ISBN: 978-3-030-26766-7
eBook Packages: Computer Science, Computer Science (R0)