WiseThrottling: a new asynchronous task scheduler for mitigating I/O bottleneck in large-scale datacenter servers

  • Published:
The Journal of Supercomputing

Abstract

Datacenter servers are entering an era of powerful multi-core and many-core processors. On these large-scale platforms, severe problems such as I/O contention pose an unprecedented challenge. Prior studies primarily treated I/O bandwidth as the major performance bottleneck. However, our work reveals that in many cases the fundamental cause of I/O contention is the inefficiency of OS schedulers. Modern systems are not aware of this and thus suffer from poor I/O performance, especially on datacenter servers. Based on our findings, we propose a new software-based scheduling approach, WiseThrottling, to reduce I/O contention. WiseThrottling performs asynchronous, self-adjusting scheduling of concurrent tasks. We evaluate our approach across a wide range of C/OpenMP/MapReduce workloads on a 64-core server in the Dawning Cluster datacenter. The experimental results show that WiseThrottling effectively reduces the I/O bottleneck and can improve overall system performance by up to 207%.

Author information

Corresponding author

Correspondence to Fang Lv.

Additional information

This research is supported by the National High Technology Research and Development Program of China under Grants No. 2012AA010902 and 2015AA011505; the NSFC under Grants No. 61202055, 61221062, 61303053, 61432016 and 61402445; and the National Basic Research Program of China under Grant No. 2011CB302504.

About this article

Cite this article

Lv, F., Liu, L., Cui, Hm. et al. WiseThrottling: a new asynchronous task scheduler for mitigating I/O bottleneck in large-scale datacenter servers. J Supercomput 71, 3054–3093 (2015). https://doi.org/10.1007/s11227-015-1427-7

  • DOI: https://doi.org/10.1007/s11227-015-1427-7
