ABSTRACT
No abstract available.
- A. Rodrigues, S. Sridharan, S. Thoziyoor, J. Brockman, K. Underwood, and P. Kogge. Enhancing Price/Performance for OpenMP using Processing-In-Memory. In 1st Workshop on Programming Models for Ubiquitous Parallelism (PMUP), held in conjunction with the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT), Sep 2006.
- Srinivas Sridharan. Implementing Scalable Locks and Barriers on Large-Scale Light-Weight Multithreaded Systems. M.S. CSE Thesis, University of Notre Dame, July 2006.
- J. M. Mellor-Crummey and M. L. Scott. Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors. ACM Transactions on Computer Systems, 9(1):21-65, 1991.
Index Terms
- Evaluating synchronization techniques for light-weight multithreaded/multicore architectures