A Case Study on Compiler Optimizations for the Intel® Core™ 2 Duo Processor

International Journal of Parallel Programming

Abstract

The complexity of modern processors makes software optimization increasingly difficult. Modern optimizing compilers have become essential tools for leveraging the power of recent processors: they apply high-level optimizations to exploit multi-core platforms and single-instruction-multiple-data (SIMD) instructions, and advanced code generation to deal with microarchitectural performance aspects. Using the Intel® Core™ 2 Duo processor and the Intel Fortran/C++ compiler as a case study, this paper gives a detailed account of the optimizations required to obtain high performance on modern processors.


Author information


Correspondence to Aart J. C. Bik.

Additional information

The first author was working for Intel Corp. when the paper was written but has since moved to Google Inc.


About this article

Cite this article

Bik, A.J.C., Kreitzer, D.L. & Tian, X. A Case Study on Compiler Optimizations for the Intel® Core™ 2 Duo Processor. Int J Parallel Prog 36, 571–591 (2008). https://doi.org/10.1007/s10766-008-0071-8


  • DOI: https://doi.org/10.1007/s10766-008-0071-8
