Abstract
In this work we investigate how different machine settings for a hardware Distributed Shared Memory (DSM) architecture affect the performance of parallel logic programming (PLP) systems. We use execution-driven simulation of a DASH-like multiprocessor to study the impact of the cache block size, the cache size, the network bandwidth, the write buffer size, and the coherence protocol on the performance of Andorra-I, a PLP system capable of exploiting implicit parallelism in Prolog programs. Among several other observations, we find that PLP systems favour small cache blocks regardless of the coherence protocol, while they favour large cache sizes only in the case of invalidate-based coherence. We conclude that the cache block size, the cache size, the network bandwidth, and the coherence protocol have a significant impact on the performance, while the size of the write buffer is somewhat irrelevant.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Anthony Beaumont, S. Muthu Raman, and Péter Szeredi. Flexible Scheduling of Or-Parallelism in Aurora: The Bristol Scheduler. In Aarts, E. H. L. and van Leeuwen, J. and Rem, M., editor, PARLE91: Conference on Parallel Architectures and Languages Europe, volume 2, pages 403–420. Springer Verlag, June 1991. Lecture Notes in Computer Science 506.
R. Bianchini and L. I. Kontothanassis. Algorithms for categorizing multiprocessor communication under invalidate and update-based coherence protocols. In Proceedings of the 28th Annual Simulation Symposium, April 1995.
J. A. Crammond. The Abstract Machine and Implementation of Parallel Parlog. Technical report, Dept. of Computing, Imperial College, London, June 1990.
M. Dubois, J. Skeppstedt, L. Ricciulli, K. Ramamurthy, and P. Stenstrom. The detection and elimination of useless misses in multiprocessors. In Proceedings of the 20th ISCA, pages 88–97, May 1993.
I. C. Dutra. Strategies for Scheduling And-and Or-Work in Parallel Logic Programming Systems. In Proceedings of the 1994 International Logic Programming Symposium, pages 289–304. MIT Press, 1994. Also available as technical report CSTR-94-09, from the Department of Computer Science, University of Bristol, England.
I. C. Dutra. Distributing And-and Or-Work in the Andorra-I Parallel Logic Programming System. PhD thesis, University of Bristol, Department of Computer Science, February 1995. available at http://www.cos.ufrj.br/~ines.
James R. Goodman. Using cache memory to reduce processor-memory traffic. In Proceedings of the 10th International Symposium on Computer Architecture, pages 124–131, 1983.
Markus Hitz and Erich Kaltofen, editors. Proceedings of the Second International Symposium on Parallel Symbolic Computation, PASCO’97, July 1997.
D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J. Hennessy. The directory-based cache coherence protocol for the DASH multiprocessor. Proceedings of the 17th ISCA, pages 148–159, May 1990.
D. Lenoski, J. Laudon, T. Joe, D. Nakahira, L. Stevens, A. Gupta, and J. Hennessy. The dash prototype: Logic overhead and performance. IEEE Transactions on Parallel and Distributed Systems, 4(1):41–61, Jan 1993.
Ewing Lusk, David H. D. Warren, Seif Haridi, et al. The Aurora Or-parallel Prolog System. New Generation Computing, 7(2,3):243–271, 1990.
E. M. McCreight. The Dragon Computer System, an Early Overview. In NATO Advanced Study Institute on Microarchitecture of VLSI Computers, July 1984.
Johan Montelius and Seif Haridi. An evaluation of Penny: a system for fine grain implicit parallelism. In Proceedings of 2nd International Symposium on Parallel Symbolic Computation8, July 1997.
S. Raina, D. H. D. Warren, and J. Cownie. Parallel Prolog on a Scalable Multiprocessor. In Peter Kacsuk and Michael J. Wise, editors, Implementations of Distributed Prolog, pages 27–44. Wiley, 1992.
V. Santos Costa, Bianchini, and I. C. Dutra. Parallel Logic Programming Systems on Scalable Multiprocessors. In Proceedings of the 2nd International Symposium on Parallel Symbolic Computation, PASCO’97 [8], pages 58–67, July 1997.
V. Santos Costa, R. Bianchini, and I. C. Dutra. Evaluating the impact of coherence protocols on parallel logic programming systems. In Proceedings of the 5th EUROMICRO Workshop on Parallel and Distributed Processing, pages 376–381, 1997. Also available as technical report ES-389/96, COPPE/Systems Engineering, May, 1996.
V. Santos Costa and Bianchini R. Optimising Parallel Logic Programming Systems for Scalable Machines. In Proceedings of the EUROPAR’98, Sep 1998.
V. Santos Costa, D. H. D. Warren, and R. Yang. Andorra-I: A Parallel Prolog System that Transparently Exploits both And-and Or-Parallelism. In Third ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, pages 83–93. ACM press, April 1991. SIGPLAN Notices vol 26(7), July 1991.
Evan Tick. Memory Performance of Prolog Architectures. Kluwer Academic Publishers, Norwell, MA 02061, 1987.
J. E. Veenstra and R. J. Fowler. Mint: A front end for efficient simulation of shared-memory multiprocessors. In Proceedings of the 2nd International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS’ 94), 1994.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Silva, M.G., Dutra, I.C., Bianchini, R., Costa, V.S. (1998). The Influence of Architectural Parameters on the Performance of Parallel Logic Programming Systems. In: Gupta, G. (eds) Practical Aspects of Declarative Languages. PADL 1999. Lecture Notes in Computer Science, vol 1551. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49201-1_9
Download citation
DOI: https://doi.org/10.1007/3-540-49201-1_9
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65527-5
Online ISBN: 978-3-540-49201-6
eBook Packages: Springer Book Archive