Abstract
To meet strict speed and power requirements for embedded applications, many high-end Digital Signal Processors (DSPs) employ non-orthogonal architectures, which are typically characterized by irregular data paths, heterogeneous registers, and multiple memory banks. Harvesting the benefits of such architectures requires adequate compiler support. However, their complexity presents a great challenge to compiler design, and the usual compilation techniques for general-purpose CPUs do not adapt well to the irregularity of DSPs. A complete code generation process must include the following phases: intermediate representation, code compaction, instruction scheduling, memory bank assignment (or variable partitioning), and register/accumulator assignment. Much related research considers only some of these phases, which is inadequate. In this paper, we present an effective code generation algorithm named Rotation Scheduling with Spill Codes Predicting (RSSP) to maximally exploit the benefits of non-orthogonal architectures. It consists of six parts that cover almost all phases of the code generation process. In addition to introducing the detailed principles and algorithms of the proposed RSSP, we use an analytic model to give a preliminary evaluation of its performance. The evaluation results demonstrate the effectiveness of the proposed method. We also present some preliminary ideas for generalizing RSSP, which can make it more practical and suitable for various DSPs with similar architectural features.
Cite this article
Lee, YH., Chen, C. An Efficient Code Generation Algorithm for Non-orthogonal DSP Architecture. J VLSI Sign Process Syst Sign Image Video Technol 47, 281–296 (2007). https://doi.org/10.1007/s11265-007-0053-x
DOI: https://doi.org/10.1007/s11265-007-0053-x