Automatic Acceleration of Stencil Codes in Android Devices

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10393)

Abstract

The widespread adoption of handheld devices has driven a demand for higher performance, which has required the integration of several distinct kinds of processor on a single chip. These technologies have turned current Systems on Chip into heterogeneous platforms. Stencil codes are a family of algorithms that appear in many relevant scientific and image processing applications. Accelerators are very important for improving the performance of these algorithms on heterogeneous platforms, but the development cost is very high for a mobile application developer. We propose a methodology, based on our framework Paralldroid, for automatically generating accelerated implementations of several well-known, representative stencil codes. We also measure the performance of these codes to demonstrate that Paralldroid can accelerate code without extensive or complex modifications. The results show large performance improvements in exchange for few code modifications.
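
For readers unfamiliar with the term, a stencil code updates every cell of a grid from a fixed pattern of neighbouring cells. The sketch below is only an illustration of that idea in plain sequential Java, assuming a 5-point averaging pattern; the class name `StencilSketch` and the method `jacobiStep` are hypothetical, and the code is neither taken from the paper nor an example of Paralldroid's annotations or generated output.

```java
public class StencilSketch {
    // One sweep of a 5-point stencil (hypothetical example): every interior
    // cell becomes the average of its four direct neighbours. Border cells
    // are copied unchanged. The input grid is not modified.
    static float[][] jacobiStep(float[][] in) {
        int rows = in.length;
        int cols = in[0].length;
        float[][] out = new float[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                boolean border = i == 0 || j == 0 || i == rows - 1 || j == cols - 1;
                out[i][j] = border
                        ? in[i][j]
                        : 0.25f * (in[i - 1][j] + in[i + 1][j]
                                 + in[i][j - 1] + in[i][j + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] grid = {
            {4, 4, 4},
            {4, 0, 4},
            {4, 4, 4}
        };
        float[][] next = jacobiStep(grid);
        System.out.println(next[1][1]); // prints 4.0
    }
}
```

Each output cell depends only on the input grid, so the loop iterations are independent of one another. That independence is the property that generally makes stencil codes good candidates for offloading to an accelerator.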

Acknowledgement

This work was supported by the EC (ERDF), the NESUS IC1315 COST Action, the Spanish Ministry of Economy, Industry and Competitiveness through the TIN2016-78919-R project, and the CAPAP-H network.

Author information

Corresponding author

Correspondence to Sergio Afonso.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Afonso, S., Acosta, A., Almeida, F. (2017). Automatic Acceleration of Stencil Codes in Android Devices. In: Ibrahim, S., Choo, KK., Yan, Z., Pedrycz, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science, vol 10393. Springer, Cham. https://doi.org/10.1007/978-3-319-65482-9_6

  • DOI: https://doi.org/10.1007/978-3-319-65482-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-65481-2

  • Online ISBN: 978-3-319-65482-9

  • eBook Packages: Computer Science, Computer Science (R0)
