Transformations of a 3D Image Reconstruction Algorithm for Data Transfer and Storage Optimisation

Achteren, Tanja Van; Adé, Marleen; Lauwereins, Rudy; Proesmans, Marc; Gool, Luc Van; Bormans, Jan; Catthoor, Francky

doi:10.1023/A:1008958303888

Published: August 2000

Volume 5, pages 313–327, (2000)
Design Automation for Embedded Systems

Tanja Van Achteren¹,
Marleen Adé¹,
Rudy Lauwereins¹,
Marc Proesmans¹,
Luc Van Gool¹,
Jan Bormans² &
Francky Catthoor²

Abstract

When implementing a 3D image reconstruction algorithm on a DSP architecture, we are confronted with a large memory transfer overhead, which reduces the speedup attainable on recent multimedia-oriented architectures. This paper describes how the critical part of the algorithm is re-specified and aggressively transformed at the algorithm code level to improve the data access locality of the multi-dimensional image signal while preserving the input/output behaviour. Experiments show that close-to-optimal reuse of the data in the foreground memory and registers is obtained, removing the data transfer and storage bottleneck and enabling real-time prototyping of the algorithm on a DSP architecture.
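
The kind of locality-improving transformation the abstract refers to can be illustrated with a small, generic sketch. It is not the authors' algorithm or their actual transformation: the 3-tap separable filter and the function names `blur_two_pass` and `blur_fused` are assumptions chosen only for illustration. The first version writes a full-size intermediate image between two passes; the second fuses the passes and keeps only three intermediate rows alive, which is the sort of buffer that could fit in foreground memory, while producing the same output.

```python
def blur_two_pass(img):
    """Reference version: horizontal pass into a full-size intermediate
    image, followed by a separate vertical pass over that intermediate."""
    h, w = len(img), len(img[0])
    tmp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            tmp[y][x] = img[y][x - 1] + img[y][x] + img[y][x + 1]
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = tmp[y - 1][x] + tmp[y][x] + tmp[y + 1][x]
    return out


def blur_fused(img):
    """Transformed version (illustrative assumption): the two passes are
    fused and the intermediate image shrinks to a rolling 3-row buffer."""
    h, w = len(img), len(img[0])
    rows = [[0] * w for _ in range(3)]  # holds intermediate rows y-2, y-1, y
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        cur = rows[y % 3]
        # Horizontal pass for row y, written into the rolling buffer.
        for x in range(1, w - 1):
            cur[x] = img[y][x - 1] + img[y][x] + img[y][x + 1]
        # As soon as three intermediate rows exist, emit output row y-1.
        if y >= 2:
            above, mid = rows[(y - 2) % 3], rows[(y - 1) % 3]
            for x in range(1, w - 1):
                out[y - 1][x] = above[x] + mid[x] + cur[x]
    return out
```

Both functions read the same input and return the same result, so the input/output behaviour is preserved; only the lifetime and size of the intermediate data change, from a whole image to three rows.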

Author information

Authors and Affiliations

KULeuven - ESAT/ACCA-PSI, Kard. Mercierlaan 94, B-3001, Leuven, Belgium
Tanja Van Achteren, Marleen Adé, Rudy Lauwereins, Marc Proesmans & Luc Van Gool
IMEC, Kapeldreef 75, B-3001, Leuven, Belgium
Jan Bormans & Francky Catthoor

About this article

Cite this article

Achteren, T.V., Adé, M., Lauwereins, R. et al. Transformations of a 3D Image Reconstruction Algorithm for Data Transfer and Storage Optimisation. Design Automation for Embedded Systems 5, 313–327 (2000). https://doi.org/10.1023/A:1008958303888

Issue Date: August 2000
DOI: https://doi.org/10.1023/A:1008958303888