ABSTRACT
In modern software development, high-level languages are becoming progressively more feature-rich. The expressiveness and abstraction these features provide let programmers be more productive and less concerned with low-level details. It is then the compiler's job to strip away these layers of abstraction and implement the language's features. Unlike most developers, compiler engineers work within the compiler's infrastructure, looking for opportunities to optimize code or analyze programs. From that vantage point, however, they often assume that the program representation they use contains near-complete information about what will end up in the program's binary. We show that this is not quite the case. To this end, we introduce the notion of invisible instructions: instructions that are present in the binary but not visible in the program's compiler-generated intermediate representation. We use static analysis and profiling techniques to measure the prevalence of these instructions across a wide variety of programs from several benchmark suites, and show that, for some instruction types, up to 36% of occurrences are invisible on average.
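To make the notion of prevalence concrete, the following is a minimal sketch of how a per-type invisible share could be computed from instruction counts. It is not the paper's measurement method, which relies on static analysis and profiling. The function name `invisible_share`, the instruction categories, and all counts are hypothetical, and the sketch assumes that each instruction in the intermediate representation accounts for at most one binary instruction of the same type.

```python
from collections import Counter


def invisible_share(ir_counts, binary_counts):
    """Return, per instruction type, the fraction of binary occurrences
    that have no counterpart in the intermediate representation.

    Assumption (illustrative only): one IR instruction accounts for at
    most one binary instruction of the same type.
    """
    shares = {}
    for kind, in_binary in binary_counts.items():
        if in_binary == 0:
            continue  # nothing of this type in the binary, so no share to report
        # Cap the visible count so the share never goes negative.
        visible = min(ir_counts.get(kind, 0), in_binary)
        shares[kind] = (in_binary - visible) / in_binary
    return shares


# Hypothetical counts for illustration only, not data from the paper.
ir_counts = Counter({"load": 80, "store": 45, "branch": 30})
binary_counts = Counter({"load": 100, "store": 50, "branch": 30})

shares = invisible_share(ir_counts, binary_counts)
for kind, share in sorted(shares.items()):
    print(f"{kind}: {share:.0%} invisible")
```

With these made-up counts, 20 of the 100 loads in the binary have no counterpart in the intermediate representation, so loads come out as 20% invisible, while branches come out as fully visible.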
Index Terms
- More than meets the eye: invisible instructions