Performance Evaluation of NPB and SPEC CPU2006 on Various SIMD Extensions

  • Conference paper
Big Data Computing and Communications (BigCom 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9196)

Abstract

Nowadays, almost all processors integrate SIMD extensions, which deliver significant speedups for multimedia and scientific-computing programs. The length of SIMD vector registers has grown steadily: 64 bits in MMX, 128 bits in SSE, and 256 bits in AVX, while the newer Intel Many Integrated Core (MIC) architecture supports 512-bit SIMD. Although a higher speedup is theoretically possible as the vector length increases, more complex and efficient instructions are required to support vectorization. We analyze the vectorization performance of NPB and SPEC CPU2006 as the vector length increases and across the SSE, AVX, and IMCI instruction sets, and on that basis offer some advice on vector length and instruction set design.
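The relation between register width and theoretical speedup mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the helper names `lanes` and `ideal_speedup` are hypothetical, and the Amdahl-style bound is an assumption introduced here to show why wider registers alone do not guarantee proportional gains. It counts elements purely by bit width; MMX, for example, operates on integer data only.

```python
# Register widths in bits, as listed in the abstract.
SIMD_WIDTH_BITS = {"MMX": 64, "SSE": 128, "AVX": 256, "IMCI": 512}


def lanes(width_bits: int, elem_bits: int) -> int:
    """Number of elements of elem_bits bits that fit in one vector register."""
    if elem_bits <= 0 or width_bits % elem_bits != 0:
        raise ValueError("element size must evenly divide the register width")
    return width_bits // elem_bits


def ideal_speedup(vector_fraction: float, n_lanes: int) -> float:
    """Amdahl-style upper bound on speedup (an assumption, not the paper's model).

    vector_fraction is the share of scalar run time that vectorizes perfectly;
    the remainder is assumed to run at scalar speed.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie in [0, 1]")
    if n_lanes < 1:
        raise ValueError("n_lanes must be at least 1")
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / n_lanes)


if __name__ == "__main__":
    # Hypothetical workload: 80% of the run time is perfectly vectorizable.
    for name, width in SIMD_WIDTH_BITS.items():
        n = lanes(width, 32)  # 32-bit elements
        print(f"{name}: {n} lanes, bound {ideal_speedup(0.8, n):.2f}x")
```

Under these assumptions the bound flattens as the lane count grows, because the non-vectorized remainder dominates. This is consistent with the abstract's point that longer vectors need more capable instructions, so that a larger share of the code can be vectorized.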

Author information

Corresponding author

Correspondence to Bo Zhao.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhao, B., Gao, W., Zhao, R., Han, L., Sun, H., Li, Y. (2015). Performance Evaluation of NPB and SPEC CPU2006 on Various SIMD Extensions. In: Wang, Y., Xiong, H., Argamon, S., Li, X., Li, J. (eds) Big Data Computing and Communications. BigCom 2015. Lecture Notes in Computer Science, vol 9196. Springer, Cham. https://doi.org/10.1007/978-3-319-22047-5_21

  • DOI: https://doi.org/10.1007/978-3-319-22047-5_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22046-8

  • Online ISBN: 978-3-319-22047-5

  • eBook Packages: Computer Science, Computer Science (R0)
