FuPerMod: a software tool for the optimization of data-parallel applications on heterogeneous platforms

The Journal of Supercomputing

Abstract

Optimization of data-parallel applications for modern HPC platforms requires partitioning the computations between the heterogeneous computing devices in proportion to their speed. Heterogeneous data partitioning algorithms are based on computation performance models of the executing platforms. Their implementation is not trivial as it requires: accurate and efficient benchmarking of computing devices, which may share resources and/or execute different codes; appropriate interpolation methods to predict performance; and advanced mathematical methods to solve the data partitioning problem. In this paper, we present FuPerMod, a software tool that addresses these implementation issues and automates the development of data partitioning code in data-parallel applications for heterogeneous HPC platforms.
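
The three ingredients named in the abstract (benchmark data, interpolation, and a partitioning solver) can be illustrated with a minimal sketch. It is not FuPerMod's API: the function names `interp_time`, `max_units`, and `partition` are hypothetical, and the benchmark points are made up. The sketch assumes that each device's execution time does not decrease as its workload grows, and that benchmark sizes are positive and sorted. It interpolates measured times piecewise-linearly, then bisects on a common time limit so that all devices finish at about the same time.

```python
from bisect import bisect_right


def interp_time(points, d):
    """Estimate execution time (seconds) for d work units.

    points: sorted list of (size, seconds) benchmark measurements (hypothetical data).
    Uses piecewise-linear interpolation; outside the measured range it assumes
    the speed of the nearest measured point.
    """
    sizes = [p[0] for p in points]
    times = [p[1] for p in points]
    if d <= 0:
        return 0.0
    if d <= sizes[0]:
        return times[0] * d / sizes[0]
    if d >= sizes[-1]:
        return times[-1] * d / sizes[-1]
    k = bisect_right(sizes, d)  # sizes[k-1] <= d < sizes[k]
    x0, x1 = sizes[k - 1], sizes[k]
    t0, t1 = times[k - 1], times[k]
    return t0 + (t1 - t0) * (d - x0) / (x1 - x0)


def max_units(points, t_limit, total):
    """Largest workload in [0, total] whose estimated time is <= t_limit.

    Assumes the estimated time is non-decreasing in the workload.
    """
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if interp_time(points, mid) <= t_limit:
            lo = mid
        else:
            hi = mid - 1
    return lo


def partition(models, total, iters=60):
    """Split `total` work units across devices so that finish times are balanced."""
    # With this limit the fastest device alone can take all the work.
    lo, hi = 0.0, min(interp_time(m, total) for m in models)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(max_units(m, mid, total) for m in models) >= total:
            hi = mid  # limit is achievable, try a tighter one
        else:
            lo = mid
    parts = [max_units(m, hi, total) for m in models]
    excess = sum(parts) - total
    while excess > 0:
        # Remove surplus units from the device that currently finishes last.
        i = max(range(len(parts)), key=lambda j: interp_time(models[j], parts[j]))
        parts[i] -= 1
        excess -= 1
    return parts


# Hypothetical benchmarks: the first device is twice as fast as the second.
fast = [(100, 1.0), (200, 2.0)]
slow = [(100, 2.0), (200, 4.0)]
result = partition([fast, slow], 300)
```

With these made-up measurements the faster device receives twice as many units as the slower one, which is the proportional split the abstract describes. When speed varies with problem size, the same routine balances the interpolated times instead of relying on a single speed constant.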

Author information

Corresponding author

Correspondence to Vladimir Rychkov.

Additional information

This research is supported by Science Foundation Ireland (Grant 08/IN.1/I2054).

About this article

Cite this article

Clarke, D., Zhong, Z., Rychkov, V. et al. FuPerMod: a software tool for the optimization of data-parallel applications on heterogeneous platforms. J Supercomput 69, 61–69 (2014). https://doi.org/10.1007/s11227-014-1207-9
