Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11645)

Abstract

Every MapReduce job independently performs a series of complex operations such as task scheduling and resource allocation. When a single algorithm coordinates multiple MapReduce jobs, this independence produces a large amount of redundant disk I/O and repeated resource requests, which lowers resource utilization during job execution. Big data mining algorithms are usually divided into several MapReduce jobs. Taking the ItemBased algorithm as an example, this paper analyzes the resource efficiency of mining algorithms in the multi-MapReduce job collaboration scenario. It proposes an ItemBased algorithm based on DistributedCache, which uses DistributedCache to cache the I/O data passed between multiple MapReduce jobs, removes the limitation that jobs are independent of one another, and reduces the waiting delay between Map and Reduce tasks. The experimental results show that DistributedCache improves the data reading speed of MapReduce jobs. The algorithm reconstructed with DistributedCache greatly reduces the waiting delay between Map and Reduce tasks and improves resource efficiency by more than three times.
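To make the idea of sharing intermediate data between chained jobs concrete, the following is a minimal single-process sketch in Python. It is not the authors' implementation and does not use Hadoop or its DistributedCache API. It assumes a co-occurrence variant of item-based collaborative filtering, and every name in it (`map_reduce`, `recommend`, the `cache` dictionary) is hypothetical. The `cache` dictionary only stands in for a shared cache: each stage reads the previous stage's output from memory instead of writing it to disk and reading it back.

```python
from collections import defaultdict
from itertools import combinations


def map_reduce(records, mapper, reducer):
    """Run one in-memory map/shuffle/reduce pass and return {key: reduced_value}."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group mapped values by key
    return {key: reducer(key, values) for key, values in groups.items()}


# Job 1: group (user, item, rating) triples into one rating vector per user.
def user_vector_mapper(record):
    user, item, rating = record
    yield user, (item, rating)


def user_vector_reducer(user, pairs):
    return dict(pairs)


# Job 2: count how often two items are rated by the same user.
def cooccurrence_mapper(record):
    user, vector = record
    for a, b in combinations(sorted(vector), 2):
        yield (a, b), 1
        yield (b, a), 1


def count_reducer(pair, ones):
    return sum(ones)


def recommend(ratings, target_user):
    """Score items the target user has not rated, reusing cached job outputs."""
    cache = {}  # hypothetical stand-in for a cache shared between jobs
    cache["user_vectors"] = map_reduce(
        ratings, user_vector_mapper, user_vector_reducer
    )
    # Job 2 reads Job 1's output straight from the cache (no disk round trip).
    cache["cooccurrence"] = map_reduce(
        cache["user_vectors"].items(), cooccurrence_mapper, count_reducer
    )
    # Job 3: weight each co-occurrence count by the user's rating of the seen item.
    seen = cache["user_vectors"][target_user]
    scores = defaultdict(float)
    for (a, b), count in cache["cooccurrence"].items():
        if a in seen and b not in seen:
            scores[b] += count * seen[a]
    return dict(scores)
```

In a real cluster the cached data would have to be distributed to every worker node, so this sketch illustrates only the data flow between jobs, not the resource-efficiency gains reported in the abstract.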

Acknowledgment

This work was supported in part by a Research Project of the Hubei Provincial Department of Education (No. B2017590).

Author information

Corresponding author

Correspondence to Zhou Fengli.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Fengli, Z., Xiaoli, L. (2019). Resource Efficiency Optimization for Big Data Mining Algorithm with Multi-MapReduce Collaboration Scenario. In: Huang, DS., Huang, ZK., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2019. Lecture Notes in Computer Science, vol 11645. Springer, Cham. https://doi.org/10.1007/978-3-030-26766-7_46

  • DOI: https://doi.org/10.1007/978-3-030-26766-7_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26765-0

  • Online ISBN: 978-3-030-26766-7

  • eBook Packages: Computer Science, Computer Science (R0)
