Automatic Acceleration of Stencil Codes in Android Devices

Afonso, Sergio; Acosta, Alejandro; Almeida, Francisco

doi:10.1007/978-3-319-65482-9_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10393)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

The growing performance demands on handheld devices, driven by their widespread adoption, have required integrating several distinct kinds of processors on a single chip. As a result, current Systems on Chip are heterogeneous platforms. Stencil codes are a family of algorithms that appear in many relevant scientific and image processing applications. Accelerators are key to improving the performance of these algorithms on heterogeneous platforms, but the development cost is very high for a mobile application developer. We propose a methodology, based on our Paralldroid framework, for automatically generating accelerated implementations of several well-known, representative stencil codes. We also measure the performance of these codes to demonstrate that Paralldroid can accelerate code without extensive or complex modifications. The results show large performance improvements in exchange for few code modifications.
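
For readers unfamiliar with the term, a stencil code updates every cell of a grid from a fixed pattern of neighbouring cells. The sketch below is illustrative only: it is not taken from the paper and does not use Paralldroid. The function name `jacobi_step`, the 5-point averaging pattern, and the choice to leave border cells unchanged are all assumptions made for the example.

```python
def jacobi_step(grid):
    """Return a new grid after one 5-point stencil sweep.

    Hypothetical helper for illustration: each interior cell becomes the
    average of its four direct neighbours; border cells are copied unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            ) / 4.0
    return out


# Small usage example: a 4x4 grid with a "hot" top edge.
grid = [[0.0] * 4 for _ in range(4)]
grid[0] = [100.0] * 4
result = jacobi_step(grid)
```

Each output cell depends only on values from the previous grid, so all interior cells can be computed independently. That independence is what makes this class of algorithm a natural candidate for the kind of accelerator offloading the abstract describes.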

Acknowledgement

This work was supported by the EC (ERDF), the NESUS IC1315 COST Action, the Spanish Ministry of Economy, Industry and Competitiveness through the TIN2016-78919-R project, and the CAPAP-H network.

Author information

Authors and Affiliations

Universidad de La Laguna, San Cristóbal de La Laguna, Spain
Sergio Afonso, Alejandro Acosta & Francisco Almeida

Corresponding author

Correspondence to Sergio Afonso.

Editor information

Editors and Affiliations

Inria, Rennes, France
Shadi Ibrahim
University of Texas at San Antonio, San Antonio, Texas, USA
Kim-Kwang Raymond Choo
Aalto University, Espoo, Finland
Zheng Yan
University of Alberta, Edmonton, Alberta, Canada
Witold Pedrycz

Cite this paper

Afonso, S., Acosta, A., Almeida, F. (2017). Automatic Acceleration of Stencil Codes in Android Devices. In: Ibrahim, S., Choo, KK., Yan, Z., Pedrycz, W. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2017. Lecture Notes in Computer Science, vol 10393. Springer, Cham. https://doi.org/10.1007/978-3-319-65482-9_6

DOI: https://doi.org/10.1007/978-3-319-65482-9_6
Published: 11 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65481-2
Online ISBN: 978-3-319-65482-9
eBook Packages: Computer Science, Computer Science (R0)
