
Automated generation of polyhedral process networks from affine nested-loop programs with dynamic loop bounds

Authors:
Dmitry Nadezhkin
Leiden Institute of Advanced Computer Science, Leiden, The Netherlands

Hristo Nikolov
Leiden Institute of Advanced Computer Science, Leiden, The Netherlands

Todor Stefanov
Leiden Institute of Advanced Computer Science, Leiden, The Netherlands

ACM Transactions on Embedded Computing Systems, Volume 13, Issue 1s, Article No.: 28, pp 1–24, https://doi.org/10.1145/2536747.2536750

Published: 06 December 2013

Abstract

Process Networks (PNs) are a parallel model of computation (MoC) well suited to specifying embedded streaming applications in a parallel form that facilitates efficient mapping onto embedded parallel execution platforms. Unfortunately, specifying an application using a parallel MoC is a very difficult and highly error-prone task. To overcome these difficulties, we have developed the pn compiler, which derives Polyhedral Process Network (PPN) parallel specifications from sequential static affine nested loop programs (SANLPs). However, many applications, for example multimedia applications (MPEG coders/decoders, smart cameras, etc.), have adaptive and dynamic behavior that cannot be expressed as SANLPs. Therefore, in order to handle dynamic multimedia applications, in this article we address the important question of whether we can relax some of the restrictions of SANLPs while keeping the ability to perform compile-time analysis and to derive PPNs. Achieving this would significantly extend the range of applications that can be parallelized in an automated way.
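The difference between a static affine loop nest and one with a dynamic loop bound can be illustrated with a minimal sketch, written in Python purely for illustration. The function names and the data-dependent bound rule are hypothetical and are not taken from the article. In the first function every loop bound is an affine expression of the parameter `n` and the enclosing iterator, so the iteration domain is known at compile time. In the second, the inner bound is computed from input data at run time; this sketch assumes it still stays within the affine limit `i + 1`.

```python
def static_affine(a, n):
    """SANLP-style nest: bounds are affine in the parameter n and iterator i."""
    total = 0
    for i in range(n):
        for j in range(i + 1):       # upper bound i + 1 is affine in i
            total += a[i][j]
    return total


def dynamic_bound(a, n):
    """Same nest, but the inner bound depends on input data (hypothetical rule)."""
    total = 0
    for i in range(n):
        m = a[i][0] % (i + 1) + 1    # known only at run time, 1 <= m <= i + 1
        for j in range(m):
            total += a[i][j]
    return total
```

For the same input, the two functions visit different sets of iterations, and only the first set can be described exactly before the program runs.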

The main contribution of this article is a first approach for automated translation of affine nested loop programs with dynamic loop bounds into input-output equivalent Polyhedral Process Networks. In addition, we present a method for analyzing the execution overhead introduced in the PPNs derived from programs with dynamic loop bounds. The presented automated translation approach has been evaluated by deriving a PPN parallel specification from a real-life application called Low Speed Obstacle Detection (LSOD) used in the smart cameras domain. By executing the derived PPN, we have obtained results which indicate that the approach we present in this article facilitates efficient parallel implementations of sequential nested loop programs with dynamic loop bounds. That is, our approach reveals the possible parallelism available in such applications, which allows for the utilization of multiple cores in an efficient way.
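As a rough illustration of what input-output equivalence between a sequential program and a process network means, the sketch below hand-writes a two-process network for a small loop nest with a dynamic bound. It is not the article's translation method: the `bound` rule, the process bodies, and the use of a separate control FIFO to pass the bound are all assumptions of this sketch. Python threads and `queue.Queue` stand in for processes and bounded FIFO channels with blocking reads and writes.

```python
import queue
import threading


def bound(row):
    # Hypothetical data-dependent loop bound, always within 0..len(row).
    return sum(1 for x in row if x > 0)


def sequential(rows):
    out = []
    for row in rows:
        m = bound(row)                # dynamic loop bound
        acc = 0
        for j in range(m):
            acc += 2 * row[j]
        out.append(acc)
    return out


def producer(rows, ctrl, data):
    for row in rows:
        m = bound(row)
        ctrl.put(m)                   # blocking write of the bound (control token)
        for j in range(m):
            data.put(2 * row[j])      # blocking write of one data token


def consumer(n_rows, ctrl, data, out):
    for _row in range(n_rows):
        m = ctrl.get()                # blocking read of the bound
        acc = 0
        for _tok in range(m):
            acc += data.get()         # blocking read of one data token
        out.append(acc)


def process_network(rows, fifo_size=2):
    ctrl = queue.Queue(maxsize=fifo_size)
    data = queue.Queue(maxsize=fifo_size)
    out = []
    threads = [
        threading.Thread(target=producer, args=(rows, ctrl, data)),
        threading.Thread(target=consumer, args=(len(rows), ctrl, data, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because both processes access the channels in the same order and only through blocking operations, `process_network(rows)` returns the same list as `sequential(rows)` for any FIFO size of at least one. The sketch shows equivalence only; it says nothing about the performance results reported in the article.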

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published in
ACM Transactions on Embedded Computing Systems Volume 13, Issue 1s
Special Section on ESTIMedia'10
November 2013
354 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/2536747
Editors:
Naehyuck Chang
Seoul National University (SNU), Korea
,
Jian-Jia Chen
Karlsruhe Institute of Technology (KIT), Germany
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States

Journal Family
ACM Journals for the Design of Smart and Connected Systems
Publication History
- Published: 6 December 2013
- Accepted: 1 June 2012
- Revised: 1 February 2012
- Received: 1 August 2011
Author Tags
Models of Computation
compiler techniques for MPSoCs
parallel programming
polyhedral process networks
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 3
- Total Downloads: 143
- Downloads (Last 12 months): 4
- Downloads (Last 6 weeks): 1