- 1. S. Araki, A. Bilas, C. Dubnicki, J. Edler, K. Konishi, and J. Philbin, "User-space communication: A quantitative study," in SC98: High Performance Networking and Computing, (Orlando, FL), November 1998.
- 2. I. Ashok, "Runtime support for dynamic space-based applications on distributed memory multiprocessors," Tech. Rep. 94-12-03, University of Washington, Seattle, WA, Dec. 1994.
- 3. I. Ashok and J. Zahorjan, "Adhara: Runtime support for dynamic, space-based," in Proceedings of the Scalable High Performance Computing Conference, May 1994.
- 4. H. E. Bal, M. F. Kaashoek, and A. S. Tanenbaum, "Orca: A language for parallel programming of distributed systems," IEEE Transactions on Software Engineering, vol. 18, no. 3, March 1992.
- 5. B. N. Bershad, M. J. Zekauskas, and W. A. Sawdon, "The Midway distributed shared memory system," in COMPCON 1993, March 1993.
- 6. R. H. Bisseling and W. F. McColl, "Scientific computing on bulk synchronous parallel architectures," in Proceedings of the 13th IFIP World Computer Congress (B. Pehrson and I. Simon, eds.), vol. 1, pp. 509-514, Elsevier, 1994.
- 7. R. H. Bisseling, "Sparse matrix computations on bulk synchronous parallel computers," in Proceedings of the International Conference on Industrial and Applied Mathematics, (Hamburg), July 1995.
- 8. R. D. Bjornson, Linda on Distributed Memory Multiprocessors. PhD thesis, Yale University, Department of Computer Science, November 1992.
- 9. G. E. Blelloch and J. Greiner, "A provable time and space efficient implementation of NESL," in Proceedings of the 1996 ACM SIGPLAN International Conference on Functional Programming, (Philadelphia, PA), pp. 213-225, 24-26 May 1996.
- 10. R. D. Blumofe, M. Frigo, C. F. Joerg, C. E. Leiserson, and K. H. Randall, "An analysis of dag-consistent distributed shared-memory algorithms," in 8th Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 297-308, June 1996.
- 11. R. D. Blumofe and C. E. Leiserson, "Space-efficient scheduling of multithreaded computations," SIAM Journal on Computing, vol. 27, no. 1, pp. 202-229, Feb. 1998.
- 12. B. Buchberger, An Algorithm for Finding a Basis for the Residue Class Ring of a Zero-Dimensional Polynomial Ideal. PhD thesis, University of Innsbruck, 1965.
- 13. J. B. Carter, J. K. Bennett, and W. Zwaenepoel, "Implementation and performance of Munin," in Proc. of the 13th ACM Symp. on Operating Systems Principles (SOSP-13), pp. 152-164, Oct. 1991.
- 14. S. Chakrabarti and K. Yelick, "Implementing an irregular application on a distributed memory multiprocessor," in Proceedings of the Fourth ACM/SIGPLAN Symposium on Principles and Practices of Parallel Programming, pp. 169-179, May 1993.
- 15. S. Chakrabarti and K. Yelick, "On the correctness of a distributed memory Gröbner basis algorithm," in International Conference on Rewriting Techniques and Applications, (Montreal, Canada), June 1993.
- 16. J. Chase, F. Amador, E. Lazowska, H. Levy, and R. Littlefield, "The Amber system: Parallel programming on a network of multiprocessors," in Proceedings of the Twelfth ACM Symposium on Operating Systems, pp. 147-158, December 1989.
- 17. B. V. Cherkassky, A. V. Goldberg, and T. Radzik, "Shortest Paths Algorithms: Theory and Experimental Evaluation," Math. Prog., vol. 73, pp. 129-174, 1996.
- 18. C. K. Birdsall and A. B. Langdon, Plasma Physics Via Computer Simulation. McGraw-Hill, 1985.
- 19. D. Cox, J. Little, and D. O'Shea, Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. Springer-Verlag, second ed., 1997.
- 20. D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken, "LogP: Towards a realistic model of parallel computation," in Fourth ACM Symposium on Principles and Practice of Parallel Programming, pp. 1-12, May 1993.
- 21. D. E. Culler, R. M. Karp, D. Patterson, A. Sahay, E. E. Santos, K. E. Schauser, R. Subramonian, and T. von Eicken, "LogP: A practical model of parallel computation," Communications of the ACM, vol. 39, no. 11, pp. 78-85, November 1996.
- 22. C. Dubnicki, A. Bilas, K. Li, and J. F. Philbin, "Design and implementation of virtual memory-mapped communication on Myrinet," in Proceedings of 11th International Parallel Processing Symposium, pp. 388-396, April 1997.
- 23. A. C. Dusseau, D. E. Culler, K. E. Schauser, and R. P. Martin, "Fast parallel sorting under LogP: Experience with the CM-5," IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 8, pp. 791-805, August 1996.
- 24. J. Edler, A. Gottlieb, and J. Philbin, "The NECI LAMP: What, why, and how," in Proceedings of the NEC Research Symposium, (Berlin), May 1997. To be published.
- 25. A. V. Gerbessiotis and C. J. Siniolakis, "Deterministic sorting and randomized mean finding on the BSP model," in Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 223-232, June 1996.
- 26. A. V. Gerbessiotis and L. G. Valiant, "Direct bulk-synchronous parallel algorithms," Journal of Parallel and Distributed Computing, vol. 22, no. 2, pp. 251-267, August 1994.
- 27. P. B. Gibbons, Y. Matias, and V. Ramachandran, "Can a shared-memory model serve as a bridging model for parallel computation?," in 9th Annual ACM Symposium on Parallel Algorithms and Architectures, (Newport, Rhode Island), pp. 72-83, June 1997.
- 28. M. W. Goudreau, K. Lang, S. Rao, T. Suel, and T. Tsantilas, "Towards efficiency and portability: Programming with the BSP model," in Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 1-12, June 1996.
- 29. M. W. Goudreau and S. B. Rao, "Single message vs. batch communication," in Algorithms for Parallel Processing (M. T. Heath, A. Ranade, and R. S. Schreiber, eds.), vol. 105 of IMA Volumes in Mathematics and Its Applications, pp. 61-74, Springer-Verlag, 1999.
- 30. B. Grayson, M. Dahlin, and V. Ramachandran, "Experimental evaluation of QSM, a simple shared-memory model," Tech. Rep. UTCS TR98-21, University of Texas at Austin, November 1998.
- 31. J. M. D. Hill, B. McColl, D. C. Stefanescu, M. W. Goudreau, K. Lang, S. B. Rao, T. Suel, T. Tsantilas, and R. Bisseling, "BSPlib: The BSP programming library," Parallel Computing, vol. 24, no. 14, pp. 1947-1980, 1998.
- 32. R. W. Hockney and J. W. Eastwood, Computer Simulation Using Particles. New York: McGraw-Hill, 1981.
- 33. L. Iftode, J. P. Singh, and K. Li, "Understanding application performance on shared virtual memory systems," in Proceedings of the 23rd Annual International Symposium on Computer Architecture, pp. 122-133, May 1996.
- 34. P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel, "TreadMarks: Distributed shared memory on standard workstations and operating systems," in Proceedings of the Winter 1994 USENIX Conference: January 17-21, 1994, San Francisco, CA, pp. 115-132, 1994.
- 35. K. Li and P. Hudak, "Memory coherence in shared virtual memory systems," in Proceedings of the 5th Annual ACM Symposium on Principles of Distributed Computing, pp. 229-239, August 1986.
- 36. W. F. McColl, "General purpose parallel computing," in Lectures in Parallel Computation, Proceedings 1991 ALCOM Spring School on Parallel Computation (A. M. Gibbons and P. Spirakis, eds.), pp. 337-391, Cambridge University Press, 1993.
- 37. R. Miller and J. Reed, The Oxford BSP Library Users' Guide Version 1.0. Oxford Parallel, 1993.
- 38. E. Rothberg and A. Gupta, "An efficient block-oriented approach to parallel sparse Cholesky factorization," in Supercomputing '92 Proceedings, 1992.
- 39. E. Rothberg and A. Gupta, "Efficient sparse matrix factorization on high-performance workstations: exploiting the memory hierarchy," ACM Transactions on Mathematical Software, vol. 17, no. 3, pp. 313-334, September 1991.
- 40. E. Rothberg and A. Gupta, "Techniques for improving the performance of sparse matrix factorization on multiprocessor workstations," in Supercomputing '90, pp. 232-241, 1990.
- 41. D. J. Scales and M. S. Lam, "The design and evaluation of a shared object system for distributed memory machines," in First Symposium on Operating Systems Design and Implementation, 1994.
- 42. A. Sodan, G. R. Gao, O. Maquelin, J.-U. Schultz, and X.-M. Tian, "Experiences with non-numeric applications on multithreaded architectures," in Sixth ACM Symposium on Principles and Practice of Parallel Programming, (Las Vegas, NV), pp. 124-135, June 1997.
- 43. L. G. Valiant, "A bridging model for parallel computation," Communications of the ACM, vol. 33, no. 8, pp. 103-111, 1990.
- 44. L. G. Valiant, "General purpose parallel architectures," in Handbook of Theoretical Computer Science (J. van Leeuwen, ed.), vol. A: Algorithms and Complexity, ch. 18, pp. 943-971, Cambridge, MA: MIT Press, 1990.
- 45. J.-P. Vidal, "The computation of Gröbner bases on a shared memory multiprocessor," in Design and Implementation of Symbolic Computation Systems (A. Miola, ed.), no. 429 in Lecture Notes in Computer Science, pp. 81-90, Berlin: Springer-Verlag, 1990. International Symposium DISCO '90.
- 46. D. W. Walker, "Characterizing the parallel performance of a large-scale, particle-in-cell plasma simulation code," Concurrency: Practice and Experience, vol. 2, no. 4, pp. 257-288, Dec. 1990.
- 47. W. Weihl, E. Brewer, A. Colbrook, C. Dellarocas, W. Hsieh, A. Joseph, C. Waldspurger, and P. Wang, "Prelude: A system for portable parallel software," Tech. Rep. MIT/LCS/TR-519, MIT, October 1991.
- 48. C.-P. Wen and K. Yelick, "Portable runtime support for asynchronous simulation," in International Conference on Parallel Processing, August 1995.
- 49. K. Yelick, S. Chakrabarti, E. Deprit, J. Jones, A. Krishnamurthy, and C.-P. Wen, "Data structures for irregular applications," in DIMACS Workshop on Parallel Algorithms for Unstructured and Dynamic Problems, (Piscataway, NJ), June 1993.