Combining Extended Retiming and Unfolding for Rate-Optimal Graph Transformation

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

Many computation-intensive iterative or recursive applications in digital signal processing and image processing can be represented by data-flow graphs (DFGs). One execution of all tasks of a DFG is called an iteration, and the average computation time of an iteration is called the iteration period. A great deal of research has sought to optimize such applications by applying graph transformation techniques to the DFG in order to minimize this iteration period. Two of the most popular are retiming and unfolding, which can be performed in tandem to achieve an optimal iteration period. However, the resulting transformed graph is much larger than the original DFG. To the authors’ knowledge, no existing technique can be combined with minimal unfolding to transform a DFG into one whose iteration period matches that of the optimal schedule under a pipelined design. This paper proposes a new technique, extended retiming, which does exactly this. We construct the appropriate retiming functions and design an efficient retiming algorithm that may be applied directly to a DFG instead of to the larger unfolded graph. Finally, we demonstrate the effectiveness of our algorithms through experiments.
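To make the terms in the abstract concrete, the following is a minimal sketch of conventional retiming on a toy DFG. It is not the extended retiming algorithm of the paper. The three-node graph, its unit computation times, the function names, and the choice of the Leiserson-Saxe sign convention for retimed delays are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical 3-node DFG: computation time per node, and edges (u, v, delays).
TIME = {"A": 1, "B": 1, "C": 1}
EDGES = [("A", "B", 0), ("B", "C", 0), ("C", "A", 2)]

def retime(edges, r):
    """Assumed Leiserson-Saxe convention: d_r(u -> v) = d(u -> v) + r(v) - r(u)."""
    out = [(u, v, d + r[v] - r[u]) for u, v, d in edges]
    assert all(d >= 0 for _, _, d in out), "illegal retiming: negative delay"
    return out

def cycle_period(time, edges):
    """Longest total computation time along any zero-delay path.

    Assumes the zero-delay subgraph is acyclic, as in any legal DFG.
    """
    succ = {u: [] for u in time}
    for u, v, d in edges:
        if d == 0:
            succ[u].append(v)
    memo = {}
    def longest(u):
        if u not in memo:
            memo[u] = time[u] + max((longest(v) for v in succ[u]), default=0)
        return memo[u]
    return max(longest(u) for u in time)

def iteration_bound(time, edges):
    """Max over simple cycles of (total computation time) / (total delays)."""
    best = Fraction(0)
    def walk(start, u, t, d, seen):
        nonlocal best
        for a, b, w in edges:
            if a != u:
                continue
            if b == start:                      # closed a cycle back to start
                best = max(best, Fraction(t, d + w))
            elif b not in seen:                 # extend the simple path
                walk(start, b, t + time[b], d + w, seen | {b})
    for s in time:
        walk(s, s, time[s], 0, {s})
    return best

bound = iteration_bound(TIME, EDGES)
before = cycle_period(TIME, EDGES)
after = cycle_period(TIME, retime(EDGES, {"A": 0, "B": 1, "C": 1}))
print(bound, before, after)
```

For this toy graph the bound is 3/2, while the cycle period drops from 3 to 2 under the chosen retiming. With integer node times and no unfolding, conventional retiming cannot reach the fractional bound here; that gap is the one the abstract says unfolding, and the proposed extended retiming, are meant to close.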

Author information

Timothy O’Neil received his Ph.D. in Computer Science and Engineering from the University of Notre Dame in 2002, where he was awarded the Arthur J. Schmitt Fellowship. He also received master’s degrees in mathematics (1991) and computer and information sciences (1993) from The Ohio State University in Columbus, Ohio. He is presently an Assistant Professor in the Computer Science Department at the University of Akron. His current research interests include loop transformations and data scheduling.

Edwin Hsing-Mean Sha received the B.S.E. degree in computer science and information engineering from National Taiwan University, Taipei, Taiwan, in 1986; he received the M.A. and Ph.D. degrees from the Department of Computer Science, Princeton University, Princeton, NJ, in 1991 and 1992, respectively. From August 1992 to August 2000, he was with the Department of Computer Science and Engineering at the University of Notre Dame, Notre Dame, IN. He served as Associate Chairman for Graduate Studies from 1995 to 2000. Since 2000, he has been a tenured full professor in the Department of Computer Science at the University of Texas at Dallas.

He has published more than 170 research papers in refereed conferences and journals. He has served as an editor for several journals, including IEEE Transactions on Signal Processing and Journal of VLSI Signal Processing. He has also served as a program committee member for numerous conferences. He received the Oak Ridge Association Junior Faculty Enhancement Award in 1994 and an NSF CAREER Award. He was a guest editor for the special issue on Low Power Design of IEEE Transactions on VLSI Systems in 1997. He also served as program chair for the International Conference on Parallel and Distributed Computing Systems (PDCS) in 2000 and 2001. He received a teaching award in 1998.

About this article

Cite this article

O’Neil, T.W., Sha, E.HM. Combining Extended Retiming and Unfolding for Rate-Optimal Graph Transformation. J VLSI Sign Process Syst Sign Image Video Technol 39, 273–293 (2005). https://doi.org/10.1007/s11265-005-4845-6

  • DOI: https://doi.org/10.1007/s11265-005-4845-6