- AC93. B. Alpern and L. Carter. Towards a Model for Portable Parallel Performance: Exposing the Memory Hierarchy. In T. Hey and J. Ferrante, editors, Portability and Performance for Parallel Processing. Wiley, 1993.
- ACS89. A. Aggarwal, A. K. Chandra, and M. Snir. On Communication Latency in PRAM Computation. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures. ACM, June 1989.
- ACS90. A. Aggarwal, A. K. Chandra, and M. Snir. Communication Complexity of PRAMs. In Theoretical Computer Science, March 1990.
- AISS95. A. Alexandrov, M. F. Ionescu, K. E. Schauser, and C. Scheiman. LogGP: Incorporating Long Messages into the LogP model - One step closer towards a realistic model for parallel computation. Technical Report TRCS 95-09, Department of Computer Science, University of California, Santa Barbara, April 1995.
- BBB+94. V. Bala, J. Bruck, R. Bryant, R. Cypher, P. de Jong, P. Elustondo, D. Frye, A. Ho, C-T. Ho, G. Irwin, S. Kipnis, R. Lawrence, and M. Snir. The IBM external user interface for scalable parallel systems. Parallel Computing, 20(4), April 1994.
- BBC+94. V. Bala, J. Bruck, R. Cypher, P. Elustondo, A. Ho, C-T. Ho, S. Kipnis, and M. Snir. CCL: a portable and tunable collective communication library for scalable parallel computers. In 8th International Parallel Processing Symposium, April 1994.
- BCM94. E. Barton, J. Cownie, and M. McLaren. Message passing on the Meiko CS-2. Parallel Computing, 20(4), April 1994.
- Ble87. G. E. Blelloch. Scans as Primitive Parallel Operations. In Proceedings of International Conference on Parallel Processing, 1987.
- BNK92. A. Bar-Noy and S. Kipnis. Designing broadcasting algorithms in the postal model for message-passing systems. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, June 1992.
- BOS+91. D. P. Bertsekas, C. Ozveran, G. D. Stamoulis, P. Tseng, and J. N. Tsitsiklis. Optimal Communication Algorithms for Hypercubes. Journal of Parallel and Distributed Computing, 11, 1991.
- BT89. D. Bertsekas and J. Tsitsiklis. Parallel and Distributed Computation. Prentice Hall, 1989.
- CDG+93. D. E. Culler, A. Dusseau, S. C. Goldstein, A. Krishnamurthy, S. Lumetta, T. von Eicken, and K. Yelick. Parallel Programming in Split-C. In Proc. of Supercomputing, November 1993.
- CDMS93. D. E. Culler, A. Dusseau, R. Martin, and K. E. Schauser. Fast Parallel Sorting under LogP: from theory to practice. In Proceedings of the Workshop on Portability and Performance for Parallel Processing, Southampton, England, July 1993.
- CKL+94. D. E. Culler, K. Keeton, L. T. Liu, A. Mainwaring, R. Martin, S. Rodrigues, and K. Wright. Generic Active Message Interface Specification. UC Berkeley, November 1994.
- CKP+93. D. E. Culler, R. M. Karp, D. A. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a Realistic Model of Parallel Computation. In Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.
- CZ89. R. Cole and O. Zajicek. The APRAM: Incorporating asynchrony into the PRAM model. In Proceedings of the Symposium on Parallel Architectures and Algorithms, 1989.
- FW78. S. Fortune and J. Wyllie. Parallelism in Random Access Machines. In Proceedings of the 10th Annual Symposium on Theory of Computing, 1978.
- Gib89. P. B. Gibbons. A More Practical PRAM Model. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures. ACM, 1989.
- HM93. M. Homewood and M. McLaren. Meiko CS-2 Interconnect Elan-Elite Design. In Proc. of Hot Interconnects, August 1993.
- Hoc93. R. Hockney. Performance Parameters and Results for the Genesis Parallel Benchmarks. In Proceedings of the Workshop on Portability and Performance for Parallel Processing, Southampton, England, July 1993.
- JH89. S. L. Johnsson and C. T. Ho. Optimum broadcasting and personalized communication in hypercubes. IEEE Transactions on Computers, 1989.
- KGGK94. V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing. Benjamin Cummings, 1994.
- KLMadH92. R. M. Karp, M. Luby, and F. Meyer auf der Heide. Efficient PRAM Simulation on a Distributed Memory Machine. In Proceedings of the Twenty-Fourth Annual ACM Symposium of the Theory of Computing, May 1992.
- KR90. R. M. Karp and V. Ramachandran. Parallel Algorithms for Shared-Memory Machines. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science. Elsevier Science Publishers, 1990.
- KSSS93. R. Karp, A. Sahay, E. Santos, and K. E. Schauser. Optimal Broadcast and Summation in the LogP Model. In 5th Symp. on Parallel Algorithms and Architectures, June 1993.
- LC94. L. T. Liu and D. E. Culler. Measurements of Active Messages Performance on the CM-5. Technical Report UCB/CSD 94-807, CS Div., UC Berkeley, May 1994.
- LCW93. J. R. Larus, S. Chandra, and D. A. Wood. CICO: A Practical Shared-Memory Programming Performance Model. In Proceedings of the Workshop on Portability and Performance for Parallel Processing, Southampton, England, July 1993.
- Lei92. F. T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. Morgan Kaufmann, 1992.
- MV84. K. Mehlhorn and U. Vishkin. Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories. Acta Informatica, 21, 1984.
- Pie94. P. Pierce. The NX message passing interface. Parallel Computing, 20(4), April 1994.
- PY88. C. H. Papadimitriou and M. Yannakakis. Towards an Architecture-Independent Analysis of Parallel Algorithms. In Proceedings of the Twentieth Annual ACM Symposium of the Theory of Computing. ACM, 1988.
- Sny86. L. Snyder. Type Architectures, Shared Memory, and the Corollary of Modest Potential. In Ann. Rev. Comput. Sci. Annual Reviews Inc., 1986.
- SS89. Y. Saad and M. H. Schultz. Data communication in hypercubes. Journal of Parallel and Distributed Computing, 6(1), February 1989.
- SS95. K. E. Schauser and C. J. Scheiman. Experience with Active Messages on the Meiko CS-2. In 9th International Parallel Processing Symposium, April 1995.
- SV94. M. Schmidt-Voigt. Efficient parallel communication with the nCUBE 2S processor. Parallel Computing, 20(4), April 1994.
- TM94. L. W. Tucker and A. Mainwaring. CMMD: Active messages on the CM-5. Parallel Computing, 20(4), April 1994.
- Val90. L. G. Valiant. A Bridging Model for Parallel Computation. Communications of the ACM, 33(8), August 1990.
- vECGS92. T. von Eicken, D. E. Culler, S. C. Goldstein, and K. E. Schauser. Active Messages: a Mechanism for Integrated Communication and Computation. In Proc. of the 19th Int'l Symposium on Computer Architecture, Gold Coast, Australia, May 1992.