Multiprocessor, Multithreading and Memory Optimization for On-Chip Multimedia Applications

Girodias, B.; Bouchebaba, Y.; Nicolescu, G.; Aboulhamid, E. M.; Paulin, P.; Lavigueur, B.

doi:10.1007/s11265-008-0293-4

Published: 14 November 2008

Journal of Signal Processing Systems, Volume 57, pages 263–283 (2009)

B. Girodias¹, Y. Bouchebaba¹, G. Nicolescu¹, E. M. Aboulhamid², P. Paulin³ & B. Lavigueur³

Abstract

Multiprocessor System-on-Chip is one of the main drivers of the semiconductor industry revolution, enabling the integration of complex functionality on a single chip. Techniques for processor design and application optimization can be combined to design these systems more efficiently. In particular, memory optimization techniques, which improve data locality, can be combined with multithreading technology, which improves overall processor efficiency. The main challenge in combining them is adapting memory optimization techniques to the high parallelism offered by multithreading environments. This paper presents an in-depth analysis of the impact of multiprocessor and multithreading environments on memory optimization techniques. It discusses the different types of parallelization (fine and coarse grain) and their influence on memory optimization techniques. It also presents some improvements to existing memory optimization techniques, as well as the adaptations necessary to use them in this type of environment.
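
As a rough illustration of the two ideas the abstract combines, the sketch below (not taken from the paper) uses loop fusion as a stand-in for a locality-improving memory optimization and block partitioning across threads as a stand-in for coarse-grain parallelization. The function names, the doubling-and-clamping pixel operation, and the worker count are all hypothetical choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def two_pass(pixels):
    """Unfused version: stage 1 writes a full intermediate array, stage 2 re-reads it."""
    scaled = [p * 2 for p in pixels]       # intermediate buffer, same size as the input
    return [min(s, 255) for s in scaled]   # second traversal over the buffer


def fused(pixels):
    """Fused version: each intermediate value is consumed as soon as it is produced."""
    return [min(p * 2, 255) for p in pixels]


def fused_coarse_grain(pixels, workers=4):
    """Coarse-grain split (assumed scheme): one contiguous block of pixels per task."""
    if not pixels:
        return []
    block = (len(pixels) + workers - 1) // workers  # ceiling division
    chunks = [pixels[i:i + block] for i in range(0, len(pixels), block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(fused, chunks)             # results come back in chunk order
        return [value for part in parts for value in part]
```

In CPython the threads mainly show how the work is partitioned; they are not expected to speed up a pure-Python loop like this one. The point of the sketch is that the fused loop needs no intermediate buffer and that each task touches only its own contiguous block of the input.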

Author information

Authors and Affiliations

École Polytechnique de Montréal, Quebec, Canada
B. Girodias, Y. Bouchebaba & G. Nicolescu
Université de Montréal, Quebec, Canada
E. M. Aboulhamid
STMicroelectronics, Ottawa, Canada
P. Paulin & B. Lavigueur

Corresponding author

Correspondence to B. Girodias.

About this article

Cite this article

Girodias, B., Bouchebaba, Y., Nicolescu, G. et al. Multiprocessor, Multithreading and Memory Optimization for On-Chip Multimedia Applications. J Sign Process Syst Sign Image Video Technol 57, 263–283 (2009). https://doi.org/10.1007/s11265-008-0293-4

Received: 30 November 2007
Accepted: 23 September 2008
Published: 14 November 2008
Issue Date: November 2009
DOI: https://doi.org/10.1007/s11265-008-0293-4
