Transformations of a 3D Image Reconstruction Algorithm for Data Transfer and Storage Optimisation

Design Automation for Embedded Systems

Abstract

When implementing a 3D image reconstruction algorithm on a DSP architecture, we are confronted with a large memory transfer overhead, which reduces the speedup attainable on recent multimedia-oriented architectures. This paper describes how the critical part of the algorithm is re-specified and aggressively transformed at the algorithm code level to improve the data access locality of the multi-dimensional image signal while preserving the input/output behaviour. Experiments show that close-to-optimal reuse of the data in the foreground memory and registers is obtained, removing the data transfer and storage bottleneck and enabling real-time prototyping of the algorithm on a DSP architecture.
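
As a rough illustration of the kind of locality-improving, behaviour-preserving code transformation the abstract refers to, the following sketch merges two image passes so that only two intermediate rows are live at any time instead of a full intermediate image. It is not taken from the paper: the kernel (a horizontal sum followed by a vertical difference) and the names `two_pass`, `fused` and `row_sum` are hypothetical and chosen only for illustration.

```python
def two_pass(img):
    """Reference version: a full intermediate image is stored between passes."""
    h, w = len(img), len(img[0])
    # Pass 1: horizontal sum of neighbouring pixels, written to a full-size buffer.
    tmp = [[img[y][x] + img[y][x + 1] for x in range(w - 1)] for y in range(h)]
    # Pass 2: vertical difference, reading the whole buffer back.
    return [[tmp[y + 1][x] - tmp[y][x] for x in range(w - 1)] for y in range(h - 1)]


def fused(img):
    """Transformed version: passes merged, only two intermediate rows kept live."""
    h, w = len(img), len(img[0])

    def row_sum(y):
        # Intermediate row y, computed on demand instead of stored for all y.
        return [img[y][x] + img[y][x + 1] for x in range(w - 1)]

    out = []
    prev = row_sum(0)  # line buffer holding intermediate row y - 1
    for y in range(1, h):
        cur = row_sum(y)  # intermediate row y
        out.append([c - p for c, p in zip(cur, prev)])
        prev = cur  # reuse the row just computed; nothing is recomputed
    return out


img = [[1, 2, 3], [4, 6, 9], [0, 1, 5]]
assert two_pass(img) == fused(img)  # input/output behaviour is preserved
```

Both functions return the same result; the fused variant simply narrows the window of intermediate data that has to be resident, which is the property that matters when foreground memory is small.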

About this article

Cite this article

Achteren, T.V., Adé, M., Lauwereins, R. et al. Transformations of a 3D Image Reconstruction Algorithm for Data Transfer and Storage Optimisation. Design Automation for Embedded Systems 5, 313–327 (2000). https://doi.org/10.1023/A:1008958303888

  • DOI: https://doi.org/10.1023/A:1008958303888