Distributed storage control unit for the Hitachi S-3800 multivector supercomputer
- Katsuyoshi Kitai
- Tadaaki Isobe
- Tadayuki Sakakibara
- Shigeko Yazawa
- Yoshiko Tamaki
- Teruo Tanaka
- Kouichi Ishii
This paper discusses the storage control unit of the Hitachi S-3800 supercomputer series, which is capable of achieving 8 GFLOPS in each of up to four shared-memory multiprocessors. This storage control unit is distributed to the V-SCs (vector-processor-...
A model for dataflow based vector execution
Although the dataflow model has been shown to allow the exploitation of parallelism at all levels, research of the past decade has revealed several fundamental problems: Synchronization at the instruction level, token matching, coloring and re-labeling ...
Synchronized access to streams in SIMD vector multiprocessors
The synchronized and simultaneous access to several vectors that form a single stream is typical in SIMD vector multiprocessors as well as in MIMD superscalar multiprocessors with decoupled access. In this paper we propose a block-interleaved storage ...
The privatizing DOALL test: a run-time technique for DOALL loop identification and array privatization
Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. For this reason, we have developed the Privatizing DOALL test—a technique for ...
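As background for this entry, a minimal sketch of array privatization, the transformation the test enables (illustrative only; this is not the paper's Privatizing DOALL test, and the loop shown is a hypothetical example): a scratch array that is always written before it is read within each iteration carries no true cross-iteration dependence, so each iteration can be given a private copy and the loop run as a DOALL.

```python
# Hypothetical example: the scratch slot t[0] looks like a shared
# dependence, but every read is preceded by a write in the same
# iteration, so t is privatizable.
def serial(a):
    t = [0.0]                    # shared scratch array
    out = [0.0] * len(a)
    for i in range(len(a)):
        t[0] = a[i] * 2.0        # write-before-read: privatizable
        out[i] = t[0] + 1.0
    return out

def privatized(a):
    # Each iteration gets its own private scratch value, so the
    # iterations are independent and could execute in parallel.
    def body(i):
        t0 = a[i] * 2.0          # private copy of the scratch
        return t0 + 1.0
    return [body(i) for i in range(len(a))]
```

Both versions compute the same result; only the privatized form is a valid DOALL.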
Reducing data communication overhead for DOACROSS loop nests
If the iterations of a loop nest cannot be partitioned into independent tasks, data communication for data dependence is inevitable in order to execute them on parallel machines. This kind of loop nest is referred to as a DOACROSS loop nest.
This paper ...
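For readers unfamiliar with the term, a minimal sketch of what makes a loop DOACROSS (a hypothetical example, not taken from the paper): iteration i reads a value written by iteration i-1, so iterations cannot run fully independently, and on a parallel machine that value must be communicated or synchronized between the processors executing the two iterations.

```python
# Hypothetical DOACROSS loop: a[i] depends on a[i-1], a cross-iteration
# (flow) dependence. Partitioning iterations across processors therefore
# forces data communication along the dependence chain.
def doacross(b):
    a = [0] * len(b)
    for i in range(1, len(b)):
        a[i] = a[i - 1] + b[i]
    return a
```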
Evaluating automatic parallelization for efficient execution on shared-memory multiprocessors
We present a parallel code generation algorithm for complete applications and a new experimental methodology that tests the efficacy of our approach. The algorithm optimizes for data locality and parallelism, reducing or eliminating false sharing. It ...
An evaluation of directory protocols for medium-scale shared-memory multiprocessors
This paper considers alternative directory protocols for providing cache coherence in shared-memory multiprocessors with 32 to 128 processors, where the state requirements of DirN may be considered too large. We consider DiriB, i=1,2,4, DirN, Tristate (...
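As background, a toy sketch in the style of a limited-pointer directory with broadcast on overflow (illustrative only; the class below is hypothetical and is not one of the protocols evaluated in the paper): the directory entry tracks up to i sharers exactly, and once the pointers overflow it falls back to broadcasting invalidations to all nodes.

```python
# Toy limited-pointer directory entry: i exact sharer pointers,
# broadcast after pointer overflow (hypothetical sketch).
class LimitedPointerDir:
    def __init__(self, i):
        self.i = i
        self.sharers = set()
        self.broadcast = False

    def add_sharer(self, node):
        if self.broadcast or node in self.sharers:
            return
        if len(self.sharers) < self.i:
            self.sharers.add(node)
        else:
            self.broadcast = True    # pointer overflow: lose precision

    def invalidate_targets(self, all_nodes):
        # On a write, invalidations go to the tracked sharers, or to
        # every node once the entry has overflowed into broadcast mode.
        return set(all_nodes) if self.broadcast else set(self.sharers)
```

The trade-off the paper studies is exactly this loss of precision: fewer pointers mean smaller directory state but more unnecessary invalidation traffic after overflow.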
An evaluation of a compiler optimization for improving the performance of a coherence directory
Both hardware-controlled and compiler-directed mechanisms have been proposed for maintaining cache coherence in large-scale shared-memory multiprocessors, but both of these approaches have significant limitations. We examine the potential performance ...
Parallelisation of the SDEM distinct element stress analysis code on the KSR-1
The SDEM code models systems of interacting blocks of rock using the distinct element (DE) method, which represents these systems as discontinuums with each block acting under Newton's laws of motion. The data structures associated with the DE method ...
Ultrasonic wave propagation on parallel machines
“ULTSON” is a 2D code which solves the elastodynamic equations in a regular structured mesh. It has been developed at EDF to be used for non-destructive testing of nuclear power plants. Today, the code runs on classical architectures like Cray (YMP or ...
An efficient approach to computing fixpoints for complex program analysis
A chief source of inefficiency in program analysis using abstract interpretation comes from the fact that a large context (i.e., problem state) is propagated from node to node during the course of an analysis. This problem can be addressed and largely ...
Optimal local register allocation for a multiple-issue machine
This paper presents an algorithm that allocates registers optimally for straight-line code running on a generic multi-issue computer. On such a machine, an optimal register allocation is one that minimizes the number of issue slots that the code ...
Scheduling reductions
In order to detect more parallelism in scientific programs, one may extract a parallelism relative to reductions. This paper presents such a method which schedules programs with explicit computations of reductions. We describe the way the reductions are ...
A dominating set model for broadcast in all-port wormhole-routed 2D mesh networks
A new model for broadcast in wormhole-routed networks is proposed. The model uses and extends the concept of dominating sets in order to systematically develop efficient broadcast algorithms for all-port wormhole-routed systems, in which each node can ...
The interaction between virtual channel flow control and adaptive routing in wormhole networks
Multiprocessor interconnection networks based on low dimensional mesh or torus topologies and employing wormhole switching have become increasingly popular. Two concepts that have been proposed to improve the performance of such networks are Virtual ...
Fault-tolerant wormhole routing in tori
We present a method to enhance wormhole routing algorithms for deadlock-free fault-tolerant routing in tori. We consider arbitrarily-located faulty blocks and assume only local knowledge of faults. Messages are routed via shortest paths when there are ...
Performance of the CM-5 scalable file system
Assessing the performance and software interactions of emerging parallel input/output systems is a critical first step in input/output software tuning. Moreover, understanding the system response to well-understood, synthetic input/output patterns is ...
Communication in the KSR1 MPP: performance evaluation using synthetic workload experiments
We have developed an automatic technique for evaluating the communication performance of massively parallel processors (MPPs). Both communication latency and the amount of communication are investigated as a function of a few basic parameters that ...
Architecture implications of high-speed I/O for distributed-memory computers
We consider the problem of high-speed I/O for a single application running on multiple nodes of a distributed-memory parallel computer. Our model is that the parallel system is connected to an I/O system that provides the interface between the internal ...
Combining static and dynamic scheduling on distributed-memory multiprocessors
Loops are a large source of parallelism for many numerical applications. An important issue in the parallel execution of loops is how to schedule them so that the workload is well balanced among the processors. Most existing loop scheduling algorithms ...
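As background for the scheduling problem this entry addresses, a sketch of one classic dynamic policy, guided self-scheduling (shown only to illustrate how chunked loop scheduling trades overhead against load balance; it is not the hybrid scheme proposed in the paper): each of p processors repeatedly grabs ceil(remaining / p) iterations, so chunks shrink toward the end of the loop and the final imbalance is small.

```python
import math

def gss_chunks(n_iters, p):
    """Chunk sizes handed out by guided self-scheduling for n_iters
    iterations on p processors (illustrative sketch)."""
    chunks, remaining = [], n_iters
    while remaining > 0:
        c = math.ceil(remaining / p)   # next chunk: ceil(remaining / p)
        chunks.append(c)
        remaining -= c
    return chunks
```

Static schemes fix the assignment at compile time (no runtime overhead, poor balance for irregular work); dynamic schemes like this one balance at runtime at the cost of synchronization on the shared iteration counter.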
An optimal upper bound on the minimal completion time in distributed supercomputing
We first consider an MIMD multiprocessor configuration with n processors. A parallel program, consisting of n processes, is executed on this system—one process per processor. The program terminates when all processes are completed. Due to ...
Compiler techniques for maximizing fine-grain and coarse-grain parallelism in loops with uniform dependences
In this paper, an approach to the problem of exploiting parallelism within nested loops is proposed. The proposed method first finds out all the initially independent computations, and then, based on them, identifies the valid partitioning bases to ...
Data and program restructuring of irregular applications for cache-coherent multiprocessor
Applications with irregular data structures such as sparse matrices or finite element meshes account for a large fraction of engineering and scientific applications. Domain decomposition techniques are commonly used to partition these applications to ...
Nonzero structure analysis
Because the efficiency of sparse codes depends strongly on the size and structure of the input data, the peculiarities of the nonzero structures of sparse matrices must be accounted for in order to avoid unsatisfactory performance. Usually, this implies ...
Techniques to overlap computation and communication in irregular iterative applications
There are many applications in CFD and structural analysis that can be more accurately modeled using unstructured grids. Parallelization of implicit methods for unstructured grids is a difficult and important problem. This paper deals with coloring ...
Performance analysis of a synchronous, circuit-switched interconnection cached network
In many parallel applications, each computation entity (process, thread etc.) switches the bulk of its communication between a small group of other entities. We call this phenomenon switching locality. The Interconnection Cached Network (ICN) is a ...
An analysis model on nonblocking multirate broadcast networks
Designing efficient interconnection networks with powerful connecting capability remains a key issue for parallel and distributed computing systems. Much progress has been made on nonblocking broadcast networks, which can realize all one-to-many ...
Exploiting cache affinity in software cache coherence
Cache affinity is important to the performance of scalable shared memory multiprocessors. For multiprocessors without hardware cache coherence support, software cache coherence is the only alternative. Most existing software cache schemes ignore cache ...
Performance evaluation of hybrid hardware and software distributed shared memory protocols
Hardware distributed shared memory (DSM) systems efficiently support fine grain sharing of data by maintaining coherence at the level of individual cache lines and providing automatic replication in processor caches. Software DSM systems, on the other ...
Limited area numerical weather forecasting on a massively parallel computer
A data-parallel implementation on a SIMD platform of an operational numerical weather forecast model is presented. The performances of two popular numerical techniques within these models are discussed, namely finite difference (gridpoint) methods and ...
Proceedings of the 8th international conference on Supercomputing