Improving Code Density with Variable Length Encoding Aware Instruction Scheduling

Journal of Signal Processing Systems

Abstract

Variable length encoding can considerably decrease code size in VLIW processors by reducing the number of bits wasted on encoding No Operations (NOPs). A processor may have different instruction templates in which different execution slots are implicitly NOPs, but the templates may not support every combination of NOPs. The compiler can improve the efficiency of the NOP encoding by placing NOPs in such a way that the usage of implicit NOPs is maximized. Two different methods of optimizing the use of the implicit NOP slots are evaluated: (a) prioritizing function units (FUs) that have fewer implicit NOPs associated with them and (b) a post-pass to the instruction scheduler that exploits the slack of the schedule by moving operations with slack into different instruction words so that the available instruction templates are better utilized. Three different methods for selecting the basic blocks to which FU prioritization is applied are also analyzed: always, always outside inner loops, and outside inner loops only in those basic blocks where testing showed that it decreased code size. The post-pass optimizer alone saved an average of 2.4 % and a maximum of 10.5 % of instruction memory, without performance loss. Prioritizing function units in only those basic blocks where it helped gave best-case instruction memory savings of 10.7 % and average savings of 3.0 %, in exchange for an average slowdown of 0.3 %. Applying both optimizations together gave a best-case code size decrease of 12.2 % and an average decrease of 5.4 %, while performance decreased by 0.1 % on average.
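The idea behind method (b) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three slots, their bit widths, the template set, and the names `word_bits`, `schedule_bits`, and `repack` are all hypothetical, and each operation's slack window is assumed to be fixed and independent of the other operations, which a real scheduler would have to recompute from the dependence graph. An instruction word may use a template only if every slot that the template treats as an implicit NOP is unused in that word; the sketch greedily moves operations within their slack windows whenever that shrinks the encoding, without changing the number of instruction words.

```python
# Hypothetical machine description (not taken from the article).
SLOT_BITS = {"alu": 16, "lsu": 16, "mul": 16}  # encoded width of each slot
TEMPLATE_BITS = 2                              # width of the template selector
# Each template is the set of slots it encodes as implicit NOPs.
TEMPLATES = [frozenset(), frozenset({"mul"}), frozenset({"lsu", "mul"})]


def word_bits(used_slots):
    """Bits for one instruction word, using the cheapest legal template."""
    return min(
        TEMPLATE_BITS + sum(b for s, b in SLOT_BITS.items() if s not in t)
        for t in TEMPLATES
        if t.isdisjoint(used_slots)  # implicit-NOP slots must be unused
    )


def schedule_bits(schedule):
    """Total encoded size of a schedule (list of {slot: operation} words)."""
    return sum(word_bits(word.keys()) for word in schedule)


def repack(schedule, windows):
    """Greedy repacking: move an operation to another cycle inside its
    slack window if that strictly reduces the encoded size.

    windows maps an operation name to (earliest, latest) cycle; operations
    without an entry are treated as having no slack. The schedule length
    never changes, so the cycle count is unaffected.
    """
    schedule = [dict(word) for word in schedule]  # work on a copy
    improved = True
    while improved:
        improved = False
        for cycle, word in enumerate(schedule):
            for slot, op in list(word.items()):
                lo, hi = windows.get(op, (cycle, cycle))
                for target in range(lo, hi + 1):
                    other = schedule[target]
                    if target == cycle or slot in other:
                        continue  # same word, or slot already occupied
                    before = word_bits(word.keys()) + word_bits(other.keys())
                    del word[slot]
                    other[slot] = op
                    after = word_bits(word.keys()) + word_bits(other.keys())
                    if after < before:
                        improved = True
                        break  # keep the move
                    del other[slot]  # otherwise undo it
                    word[slot] = op
    return schedule


schedule = [{"alu": "add1", "mul": "mul1"}, {"lsu": "ld1"}, {"alu": "add2"}]
windows = {"ld1": (0, 1)}  # ld1 may issue in cycle 0 or 1
packed = repack(schedule, windows)
print(schedule_bits(schedule), schedule_bits(packed))  # prints "102 86"
```

In this toy machine no template makes only the `alu` and `mul` slots implicit, so a word containing just a load needs the 34-bit format. Moving the load into the first word, which already requires the full 50-bit format, lets the second word fall back to the smallest 18-bit template while the schedule keeps its three words.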



Acknowledgments

This work was funded by the Academy of Finland (funding decision 253087), the Finnish Funding Agency for Technology and Innovation (project "Parallel Acceleration", funding decision 40115/13), and the ARTEMIS Joint Undertaking under grant agreement no. 621439 (ALMARVI).

Author information

Corresponding author

Correspondence to Heikki Kultala.

About this article

Cite this article

Kultala, H., Viitanen, T., Jääskeläinen, P. et al. Improving Code Density with Variable Length Encoding Aware Instruction Scheduling. J Sign Process Syst 84, 435–446 (2016). https://doi.org/10.1007/s11265-015-1081-6

  • DOI: https://doi.org/10.1007/s11265-015-1081-6
