HostoSink: A Collaborative Scheduling in Heterogeneous Environment

Liao, Xiaofei; Xiang, Xiaobao; Jin, Hai; Zhang, Wei; Lu, Feng

doi:10.1007/978-3-319-11197-1_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8630)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Because of limits on power consumption and memory capacity, recent years have seen a strong trend toward heterogeneous environments in which accelerators such as GPUs (Graphics Processing Units), FPGAs (Field-Programmable Gate Arrays), and MIC (Many Integrated Core) coprocessors help traditional SMP (Symmetric Multi-Processing) CPUs speed up applications. In this paper, we choose the Intel MIC architecture coprocessor as the accelerator and design HostoSink, a runtime system for collaborative scheduling based on Pthread tasks. HostoSink uses the runtime characteristics of the application and of the heterogeneous environment to schedule Pthread tasks between the CPU and the MIC automatically and dynamically. It thereby gives MIC users an easier way to obtain high performance in a heterogeneous CPU-MIC environment without extensive manual optimization of the original Pthread-based multi-threaded applications. Experimental results show that HostoSink achieves an overall speedup of more than 3x over the original CPU-only execution and also reduces the average amount of data transferred between the CPU and the MIC.
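
As a rough illustration of what dynamic scheduling between a host CPU and a coprocessor involves, the following Python sketch assigns each task to whichever device is estimated to finish it first, counting the cost of moving the task's data to that device. It is not HostoSink's algorithm, which is described in the full chapter; the Device and Task classes, their fields, and the greedy earliest-finish rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    work: float   # abstract units of computation (hypothetical)
    data: float   # abstract units of input data (hypothetical)


@dataclass
class Device:
    name: str
    throughput: float      # work units processed per second (hypothetical)
    transfer_cost: float   # seconds per data unit copied to this device
    busy_until: float = 0.0
    assigned: list = field(default_factory=list)


def finish_time(dev: Device, task: Task) -> float:
    """Estimated completion time if `task` is queued on `dev` now."""
    transfer = task.data * dev.transfer_cost
    compute = task.work / dev.throughput
    return dev.busy_until + transfer + compute


def schedule(tasks, devices):
    """Greedy rule: place each task on the device that finishes it earliest."""
    for task in tasks:
        best = min(devices, key=lambda d: finish_time(d, task))
        best.busy_until = finish_time(best, task)
        best.assigned.append(task.name)
    return {d.name: list(d.assigned) for d in devices}


if __name__ == "__main__":
    # Illustrative numbers only: the coprocessor is faster but pays for transfers.
    cpu = Device("cpu", throughput=1.0, transfer_cost=0.0)
    mic = Device("mic", throughput=4.0, transfer_cost=0.5)
    tasks = [Task("A", work=8, data=1), Task("B", work=1, data=4), Task("C", work=8, data=1)]
    print(schedule(tasks, [cpu, mic]))
```

In this toy model a compute-heavy task with little data tends to land on the coprocessor, while a data-heavy task with little computation stays on the host, which is the kind of trade-off a CPU-MIC scheduler has to weigh when it also wants to limit data transmission.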

Author information

Authors and Affiliations

Services Computing Technology and System Lab, Cluster and Grid Computing Lab, Huazhong University of Science and Technology, Wuhan, 430074, China
Xiaofei Liao, Xiaobao Xiang, Hai Jin, Wei Zhang & Feng Lu

Editor information

Editors and Affiliations

Department of Computer Science, Illinois Institute of Technology, 60616-3793, Chicago, IL, USA
Xian-he Sun
School of Computer Science and Technology, Dalian Maritime University, 1 Linghai Road, 116026, Dalian, China
Wenyu Qu
University of Ottawa, SEECS, 8, King Edward Ave, K1N 6N5, Ottawa, ON, Canada
Ivan Stojmenovic
Deakin University, 221 Burwood Highway, 3125, Burwood, VIC, Australia
Wanlei Zhou
Dalian Maritime University, No. 1 Linghai Road, 116026, Dalian, China
Zhiyang Li & Tingting Yang
BeiHang University, XueYuan Road No. 37, HaiDian District, Beijing, China
Hua Guo
University of Bradford, BD7 1DP, Bradford, West Yorkshire, United Kingdom
Geyong Min
Computer Network Information Center, Chinese Academy of Sciences, 100190, Beijing, China
Yulei Wu
27 Shanda Nanlu, 250100, Jinan City, Shandong Province, China
Lei Liu

About this paper

Cite this paper

Liao, X., Xiang, X., Jin, H., Zhang, W., Lu, F. (2014). HostoSink: A Collaborative Scheduling in Heterogeneous Environment. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8630. Springer, Cham. https://doi.org/10.1007/978-3-319-11197-1_17

DOI: https://doi.org/10.1007/978-3-319-11197-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11196-4
Online ISBN: 978-3-319-11197-1
eBook Packages: Computer Science, Computer Science (R0)
