Energy efficiency of the simulation of three-dimensional coastal ocean circulation on modern commodity and mobile processors

A case study based on the Haswell and Cortex-A15 microarchitectures

  • Special Issue Paper
Computer Science - Research and Development

Abstract

We analyze the energy efficiency of a 3D coastal ocean simulator on the Haswell and Cortex-A15 microarchitectures and propose a simple yet effective way to model energy-to-solution on different hardware platforms. The work also demonstrates that using processors from the field of embedded/mobile computing can increase energy efficiency by 50 %.

Acknowledgments

This work has been supported in part by the German Research Foundation (DFG) through the Priority Program 1648 ‘Software for Exascale Computing’ (Grants TU 102/50-1, GO 1758/2-1), and through the individual Grant AI 117/1. ICARUS hardware is financed by MIWF NRW under the lead of MERCUR.

Author information

Corresponding author

Correspondence to Markus Geveler.

About this article

Cite this article

Geveler, M., Reuter, B., Aizinger, V. et al. Energy efficiency of the simulation of three-dimensional coastal ocean circulation on modern commodity and mobile processors. Comput Sci Res Dev 31, 225–234 (2016). https://doi.org/10.1007/s00450-016-0324-5

  • DOI: https://doi.org/10.1007/s00450-016-0324-5
