ARM/NEON Co-design of Multiplication/Squaring

Seo, Hwajeong; Park, Taehwan; Ji, Janghyun; Hu, Zhi; Kim, Howon

doi:10.1007/978-3-319-93563-8_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10763)

Included in the following conference series:

International Workshop on Information Security Applications

Abstract

Many modern mobile processors support SIMD extensions (e.g., the NEON engine), and existing applications written for SISD execution (e.g., image processing and cryptography) can be accelerated by rewriting them with SIMD instruction sets. In particular, integer multiplication and squaring are the most expensive operations in Public Key Cryptography (PKC), and many works have reduced their execution time using the NEON instruction set. However, ARM–NEON processors also support the powerful ARM instruction set, and using ARM instructions together with the NEON engine yields further performance gains. Based on this observation, we introduce a new parallel approach for integer multiplication and squaring on ARM–NEON processors. Unlike previous implementations, we mix ARM and NEON instructions so that the latency of the ARM computation is hidden behind the NEON computation. Since the ARM and NEON modules are separate units, their instructions are issued independently. Integer multiplication and squaring are finely divided into several sub-tasks, which are assigned to ARM and NEON so as to balance the workloads. The proposed implementations outperform the best-known results on the identical ARM–NEON processors by 22.4% and 18.3% for 2048-bit integer multiplication and squaring, respectively.

This work was supported by the Energy Efficiency & Resources Core Technology Program of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20152000000170). Hwajeong Seo was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2014-0-00743) supervised by the IITP (Institute for Information & communications Technology Promotion). Zhi Hu was partially supported by the Natural Science Foundation of China (Grant No. 61602526).

Notes

1.
When an m-bit multiplication is required, it is evenly divided into four m/2-bit multiplications. Three of them are performed with the COS method on the NEON engine; the remaining m/2-bit multiplication is performed on the ARM processor with the hybrid-scanning method (width: 64-bit).
2.
$\mathtt{umlal\; a,b,c,d:}\; \{\mathtt{b,a}\} \leftarrow \{\mathtt{b,a}\}+ \mathtt{c} \times \mathtt{d}$.
3.
If multi-core processing is enabled through the OpenMP library and multiple threads are executed, the performance improves in proportion to the number of threads.
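
The first two notes can be illustrated with a small Python model. This is only a sketch of the arithmetic, not the authors' ARM/NEON assembly: the function names `umlal` and `split_multiply` are hypothetical, Python integers stand in for registers and multi-word operands, and the choice of which of the four half-size products goes to the ARM side is an assumption, since the note only says that three run on NEON and one on ARM.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def umlal(lo, hi, c, d):
    """Model of note 2: {hi, lo} <- {hi, lo} + c * d on 32-bit words.

    Returns the updated (lo, hi) pair. The 64-bit accumulator wraps
    modulo 2**64, as the hardware instruction does.
    """
    acc = ((hi << WORD_BITS) | lo) + c * d
    acc &= (1 << (2 * WORD_BITS)) - 1
    return acc & WORD_MASK, acc >> WORD_BITS


def split_multiply(a, b, m):
    """Model of note 1: one m-bit product built from four m/2-bit products."""
    half = m // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # Assumption: the low product is the one handled by the ARM core.
    # The note does not say which of the four sub-products that is.
    arm_product = a_lo * b_lo

    # The other three sub-products would run on the NEON engine.
    neon_cross_1 = a_lo * b_hi
    neon_cross_2 = a_hi * b_lo
    neon_high = a_hi * b_hi

    # Recombine: a*b = lo*lo + (cross terms) * 2^(m/2) + hi*hi * 2^m.
    return arm_product + ((neon_cross_1 + neon_cross_2) << half) + (neon_high << m)
```

For example, `umlal(0xFFFFFFFF, 0, 2, 3)` returns `(5, 1)`, because the carry out of the low word moves into the high word. For any two m-bit operands, `split_multiply(a, b, m)` equals `a * b`. The model checks only the arithmetic of the decomposition; it says nothing about instruction scheduling or timing.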

Author information

Authors and Affiliations

Hansung University, Seoul, Republic of Korea
Hwajeong Seo
Pusan National University, Busan, Republic of Korea
Taehwan Park, Janghyun Ji & Howon Kim
Central South University, Changsha, China
Zhi Hu

Corresponding author

Correspondence to Howon Kim.

Editor information

Editors and Affiliations

KAIST, Daejeon, Korea (Republic of)
Brent ByungHoon Kang
Georgia Institute of Technology, Atlanta, Georgia, USA
Taesoo Kim

About this paper

Cite this paper

Seo, H., Park, T., Ji, J., Hu, Z., Kim, H. (2018). ARM/NEON Co-design of Multiplication/Squaring. In: Kang, B., Kim, T. (eds) Information Security Applications. WISA 2017. Lecture Notes in Computer Science, vol 10763. Springer, Cham. https://doi.org/10.1007/978-3-319-93563-8_7

DOI: https://doi.org/10.1007/978-3-319-93563-8_7
Published: 23 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93562-1
Online ISBN: 978-3-319-93563-8
eBook Packages: Computer Science, Computer Science (R0)
