Skip to main content
Log in

Asymmetry-aware load balancing for parallel applications in single-ISA multi-core systems

  • Published:
Journal of Zhejiang University SCIENCE C Aims and scope Submit manuscript

Abstract

Contemporary operating systems for single-ISA (instruction set architecture) multi-core systems attempt to distribute tasks equally among all the CPUs. This approach works relatively well when there is no difference in CPU capability. However, there are cases in which CPU capability differs from one another. For instance, static capability asymmetry results from the advent of new asymmetric hardware, and dynamic capability asymmetry comes from the operating system (OS) outside noise caused from networking or I/O handling. These asymmetries can make it hard for the OS scheduler to evenly distribute the tasks, resulting in less efficient load balancing. In this paper, we propose a user-level load balancer for parallel applications, called the’ capability balancer’, which recognizes the difference of CPU capability and makes subtasks share the entire CPU capability fairly. The balancer can coexist with the existing kernel-level load balancer without detrimenting the behavior of the kernel balancer. The capability balancer can fairly distribute CPU capability to tasks with very little overhead. For real workloads like the NAS Parallel Benchmark (NPB), we have accomplished speedups of up to 9.8% and 8.5% in dynamic and static asymmetries, respectively. We have also experienced speedups of 13.3% for dynamic asymmetry and 24.1% for static asymmetry in a competitive environment. The impacts of our task selection policies, FIFO (first in, first out) and cache, were compared. The use of the cache policy led to a speedup of 5.3% in overall execution time and a decrease of 4.7% in the overall cache miss count, compared with the FIFO policy, which is used by default.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Asanovic, K., Bodik, R., Demmel, J., Keaveny, T., Keutzer, K., Kubiatowicz, J., Morgan, N., Patterson, D., Sen, K., Wawrzynek, J., et al., 2009. A view of the parallel computing landscape. Commun. ACM, 52(10):56–67. [doi:10.1145/1562764.1562783]

    Article  Google Scholar 

  • Bailey, D.H., Barszcz, E., Barton, J.T., Browning, D.S., Carter, R.L., Dagum, L., Fatoohi, R.A., Frederickson, P.O., Lsinski, T.A., Schreiber, R.S., et al., 1991. The NAS Parallel Benchmarks. Int. J. Supercomput. Appl., 5(3): 63–73.

    Article  Google Scholar 

  • Beckman, P., Iskra, K., Yoshii, K., Coghlan, S., 2006. Operating system issues for petascale systems. ACM SIGOPS Oper. Syst. Rev., 40(2):29–33. [doi:10.1145/1131322.1131332]

    Article  Google Scholar 

  • Beckman, P., Iskra, K., Yoshii, K., Coghlan, S., Nataraj, A., 2008. Benchmarking the effects of operating system interference on extreme-scale parallel machines. Cluster Comput., 11(1):3–16. [doi:10.1007/s10586-007-0047-2]

    Article  Google Scholar 

  • Boneti, C., Gioiosa, R., Cazorla, F.J., Valero, M., 2008. A Dynamic Scheduler for Balancing HPC Applications. Proc. ACM/IEEE Conf. on Supercomputing, Article No.41, p.1–12.

  • Bovet, D.P., Cesati, M., 2005. Understanding the Linux Kernel (3rd Ed.). O’Reilly, p.258–293.

  • Brodowski, D., 2005. Current Trend in Linux Kernel Power Management, linuxtag 2005. Available from http://www.free-it.de/archiv/talks_2005/paper-11017/paper-11017.pdf [Accessed on Apr. 15, 2012].

  • De, P., Kothari, R., Mann, V., 2007. Identifying Sources of Operating System Jitter Through Fine-Grained Kernel Instrumentation. Proc. IEEE Int. Conf. on Cluster Computing, p.331–340. [doi:10.1109/CLUSTR.2007.4629247]

  • Demers, A., Keshav, S., Shenker, S., 1989. Analysis and simulation of a fair queueing algorithm. ACM SIGCOMM Comput. Commun. Rev., 19(4):1–12. [doi:10.1145/75247.75248]

    Article  Google Scholar 

  • Ferreira, K., Brightwell, R., Bridges, P., 2008. Characterizing Application Sensitivity to OS Interference Using Kernel-Level Noise Injection. Proc. ACM/IEEE Conf. on Supercomputing, Article No. 19, p.1–12.

  • Gioiosa, R., Petrini, F., Davis, K., Lebaillif-Delamare, F., 2004. Analysis of System Overhead on Parallel Computers. 4th IEEE Int. Symp. on Signal Processing and Information Technology, p.387–390.

  • Gioiosa, R., McKee, S.A., Valero, M., 2010. Designing OS for HPC Applications: Scheduling. Proc. IEEE Int. Conf. on Cluster Computing, p.78–87. [doi:10.1109/CLUSTER.2010.16]

  • Hill, M.D., Marty, M.R., 2008. Amdahl’s law in the multicore era. IEEE Comput., 41(7):33–38. [doi:10.1109/MC.2008.209]

    Article  Google Scholar 

  • Hoefler, T., Mehlan, T., Lumsdaine, A., Rehm, W., 2007. Netgauge: a Network Performance Measurement Framework. Proc. High Performance Computing and Communications, p.659–671.

  • Hofmeyr, S., Iancu, C., Blagojevíc, F., 2010. Load Balancing on Speed. Proc. 15th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, p.147–158. [doi:10.1145/1693453.1693475]

  • Jiang, X., Mishra, A., Zhao, L., Iyer, R., Fang, Z., Srinivasan, S., Makineni, S., Brett, P., Das, C.R., 2011. ACCESS: Smart Scheduling for Asymmetric Cache CMPs. Proc. IEEE 17th Int. Symp. on High Performance Computer Architecture, p.527–538. [doi:10.1109/HPCA.2011.5749757]

  • Koufaty, D., Reddy, D., Hahn, S., 2010. Bias Scheduling in Heterogeneous Multi-core Architectures. Proc. 5th European Conf. on Computer Systems, p.125–138. [doi:10.1145/1755913.1755928]

  • Kumar, R., Farkas, K.I., Jouppi, N.P., Ranganathan, P., Tullsen, D.M., 2003. Single-ISA Heterogeneous Multi-core Architectures: the Potential for Processor Power Reduction. Proc. 36th Annual IEEE/ACM Int. Symp. on Micro-architecture, p.81–92.

  • Kumar, R., Tullsen, D.M., Ranganathan, P., Jouppi, N.P., Farkas, K.I., 2004. Single-ISA Heterogeneous Multi-core Architectures for Multithreaded Workload Performance. Proc. 31st Annual Int. Symp. on Computer Architecture, p.64–75.

  • Lakshminarayana, N., Rao, S., Kim, H., 2008. Asymmetry Aware Scheduling Algorithms for Asymmetric Multiprocessors. Proc. 4th Annual Workshop on the Interaction Between Operating Systems and Computer Architecture.

  • Li, T., Baumberger, D., Hahn, S., 2009. Efficient and Scalable Multiprocessor Fair Scheduling Using Distributed Weighted Round-Robin. Proc. 14th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, p.65–74. [doi:10.1145/1504176.1504188]

  • Olsson, R., 2005. pktgen the Linux Packet Generator. Proc. Linux Symp., p.11–24.

  • Pallipadi, V., Starikovskiy, A., 2006. The Ondemand Governor. Proc. Linux Symp., p.223–238.

  • Petrini, F., Kerbyson, D.J., Pakin, S., 2003. The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8192 Processors of ASCI Q. Proc. ACM/IEEE Conf. on Supercomputing, p.55–71. [doi:10.1145/1048935.1050204]

  • Radojkovic, P., Cakarevic, V., Verdu, J., Pajuelo, A., Gioiosa, R., Cazorla, F.J., Nemirovsky, M., Valero, M., 2008. Measuring Operating System Overhead on CMT Processors. Proc. 20th Int. Symp. on Computer Architecture and High Performance Computing, p.133–140. [doi:10.1109/SBAC-PAD.2008.19]

  • Saez, J.C., Prieto, M., Fedorova, A., Blagodurov, S., 2010. A Comprehensive Scheduler for Asymmetric Multicore Systems. Proc. 5th European Conf. on Computer Systems, p.139–152. [doi:10.1145/1755913.1755929]

  • Salim, J.H., 2001. Beyond Softnet. Proc. 5th Annual Linux Showcase and Conf., p.165–172.

  • Scogland, T., Balaji, P., Feng, W., Narayanaswamy, G., 2008. Asymmetric Interactions in Symmetric Multi-core Systems: Analysis, Enhancements and Evaluation. Proc. ACM/IEEE Conf. on Supercomputing, Article No. 17, p.1–12.

  • Tsafrir, D., Etsion, Y., Feitelson, D.G., Kirkpatrick, S., 2005. System Noise, OS Clock Ticks, and Fine-Grained Parallel Applications. Proc. 19th Annual Int. Conf. on Supercomputing, p.303–312. [doi:10.1145/1088149.1088190]

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Hyeonsang Eom.

Additional information

Project supported by the Next-Generation Information Computing Development Program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (No. 2011-0020521) and the Korea Communications Commission, under the Communications Policy Research Center Support Program supervised by the Korea Communications Agency (No. KCA-2011-1194100004-110010100)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kim, E., Eom, H. & Yeom, H.Y. Asymmetry-aware load balancing for parallel applications in single-ISA multi-core systems. J. Zhejiang Univ. - Sci. C 13, 413–427 (2012). https://doi.org/10.1631/jzus.C1100198

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1631/jzus.C1100198

Key words

CLC number

Navigation