DOI: 10.1145/143369
ICS '92: Proceedings of the 6th International Conference on Supercomputing
Publisher:
Association for Computing Machinery, New York, NY, United States
Conference:
ICS '92: ACM SIGARCH International Conference on Supercomputing, Washington, D.C., USA, July 19-24, 1992
ISBN:
978-0-89791-485-7
Published:
01 August 1992

Evaluation of compiler optimizations for Fortran D on MIMD distributed memory machines

The Fortran D compiler uses data decomposition specifications to automatically translate Fortran programs for execution on MIMD distributed-memory machines. This paper introduces and classifies a number of advanced optimizations needed to achieve ...
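
For orientation (a sketch of ours, not the Fortran D compiler's actual output): once an array is given a BLOCK decomposition over P processors, generated code must translate every global index into an owning processor and a local offset, roughly as follows.

    #include <stdio.h>

    /* Illustrative owner/offset computation for a BLOCK distribution
       of an N-element array over P processors. This is our sketch of
       the general idea, not the Fortran D compiler's code. */
    static int block_size(int n, int p) { return (n + p - 1) / p; }
    static int owner(int i, int n, int p) { return i / block_size(n, p); }
    static int local(int i, int n, int p) { return i % block_size(n, p); }

    int main(void) {
        int n = 100, p = 4;
        /* Global element 57 lives on processor 2 as local element 7. */
        printf("A(57) -> proc %d, local %d\n",
               owner(57, n, p), local(57, n, p));
        return 0;
    }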

Evaluation of compiler generated parallel programs on three multicomputers

Distributed memory parallel processors (DMPPs) have no hardware support for a global address space. However, conventional programs written in a sequential imperative language such as Fortran typically manipulate a few large arrays. The Oxygen compiler, ...

Automatic data mapping for distributed-memory parallel computers

The performance of a program on a distributed-memory parallel computer depends on the algorithms employed, the structure and speed of the machine's communication network, and the ways in which data are distributed to the processors. This paper addresses ...

Characterizing memory performance in vector multiprocessors

We propose a set of three memory performance measures directed at vector multiprocessors. One is the port reservation time, which is closely related to the commonly used memory bandwidth measure. The second is the vector fill time, the latency ...

Performance analysis of the CM-2, a massively parallel SIMD computer

The performance evaluation process for a massively parallel distributed memory SIMD computer is described generally. The performance in basic computation, grid communication, and computation with grid communication is analyzed. A practical performance ...

Evaluation of the lock mechanism in a snooping cache

This paper discusses the design concepts of a lock mechanism for a Parallel Inference Machine (the PIM/c prototype) and investigates the performance of the mechanism in detail.

Lock operations are extremely frequent on the PIM; however, lock contention ...
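
For context, a standard lock discipline on snooping-cache machines (a generic illustration, not the PIM/c mechanism) is test-and-test-and-set: spin on the locally cached value and touch the bus only when the lock appears free.

    #include <stdatomic.h>

    /* Test-and-test-and-set spin lock. The inner loop spins on the
       locally cached copy, so a waiting processor generates bus
       traffic only when the lock is released and the exchange is
       retried. Generic sketch, not the PIM/c design. */
    typedef struct { atomic_int locked; } spinlock_t;   /* init to {0} */

    static void spin_lock(spinlock_t *l) {
        for (;;) {
            while (atomic_load(&l->locked))           /* spin in cache */
                ;
            if (atomic_exchange(&l->locked, 1) == 0)  /* one bus op */
                return;
        }
    }

    static void spin_unlock(spinlock_t *l) {
        atomic_store(&l->locked, 0);
    }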

Processor allocation and loop scheduling on multiprocessor computers

This paper is concerned with the automatic exploitation of the parallelism detected in a sequential program. The target machine is a shared memory multiprocessor.

The main goal is minimizing the completion time of the program. To achieve this, one has ...
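
One classic policy in this design space is chunked self-scheduling, where processors claim fixed-size blocks of iterations from a shared counter; a minimal sketch of ours, not taken from the paper:

    #include <stdatomic.h>

    /* Chunked self-scheduling: each processor repeatedly claims CHUNK
       iterations from a shared counter until the loop is exhausted.
       Assumes next_iter starts at 0 for a single loop instance.
       body() is a hypothetical loop body. */
    #define CHUNK 16

    extern void body(int i);
    static atomic_int next_iter;

    void worker(int n_iters) {
        for (;;) {
            int lo = atomic_fetch_add(&next_iter, CHUNK);
            if (lo >= n_iters) return;
            int hi = lo + CHUNK < n_iters ? lo + CHUNK : n_iters;
            for (int i = lo; i < hi; i++)
                body(i);
        }
    }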

Low level scheduling using the hierarchical task graph

This paper introduces a new efficient instruction scheduling algorithm that can schedule across basic blocks. Scheduling globally, across basic blocks, is done by using an extension of the control flow graph (CFG) that combines both data and control ...

Deriving good transformations for mapping nested loops on hierarchical parallel machines in polynomial time

We present a computationally efficient method for deriving the most appropriate transformation and mapping of a nested loop for a given hierarchical parallel machine. This method is in the context of our systematic and general theory of unimodular loop ...

ABCL/onEM-4: a new software/hardware architecture for object-oriented concurrent computing on an extended dataflow supercomputer

Object-oriented software construction is becoming more and more prevalent, and parallel programming is no exception. In the context of parallel computation, it is often natural to model the computation as message passing between ...

Tolerating data access latency with register preloading

By exploiting fine grain parallelism, superscalar processors can potentially increase the performance of future supercomputers. However, supercomputers typically have a long access delay to their first level memory which can severely restrict the ...
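
The idea can be mimicked by hand in source code (our sketch of the general technique, not the paper's compiler algorithm): issue the load for iteration i+1 before doing the work of iteration i, so the memory latency overlaps computation instead of stalling it.

    /* Hand-written analogue of register preloading: the load for the
       next iteration is started before the current iteration's work,
       so the access latency is hidden behind the arithmetic. */
    double sum_preloaded(const double *a, int n) {
        if (n == 0) return 0.0;
        double sum = 0.0;
        double cur = a[0];                /* preload first element */
        for (int i = 0; i < n - 1; i++) {
            double next = a[i + 1];       /* preload next iteration */
            sum += cur * cur;             /* compute with current */
            cur = next;
        }
        return sum + cur * cur;
    }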

Supercomputing and transputers

We study the degree to which parallel supercomputers can be scaled, discuss the measures necessary to achieve maximum scalability, and present a case study. To this end, a new class of “supermassively parallel architectures” is ...

Automatic software cache coherence through vectorization

Access latency in large-scale shared-memory multiprocessors is a concern since most (if not all) memory is one or more hops away through an interconnection network. Providing processors with one or more levels of cache is an accepted way to reduce the ...

Life span strategy—a compiler-based approach to cache coherence

In this paper, a cache coherence strategy with a combined software and hardware approach is proposed for large-scale multiprocessor systems. The new strategy has the scalability advantages of existing software strategies and does not rely on shared ...

Conflict-free access of vectors with power-of-two strides

An address mapping and an access order are presented for conflict-free access to vectors with any initial address and power-of-two strides. We show that for this conflict-free access it is necessary that the memory be unmatched and present an ...
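
To see the problem being solved (our illustration, not the paper's mapping): with M interleaved banks, M a power of two, a stride of 2^s touches only M / gcd(M, 2^s) distinct banks, so throughput collapses unless addresses are remapped.

    #include <stdio.h>

    /* Count how many distinct banks a strided vector access touches
       under plain low-order interleaving. Illustrates the conflict
       problem, not the paper's conflict-free mapping. */
    int main(void) {
        int banks = 8;
        for (int stride = 1; stride <= 8; stride <<= 1) {
            int used[8] = {0}, distinct = 0;
            for (int i = 0; i < 64; i++) {
                int b = (i * stride) % banks;    /* bank of element i */
                if (!used[b]) { used[b] = 1; distinct++; }
            }
            /* stride 1 -> 8 banks, 2 -> 4, 4 -> 2, 8 -> 1 */
            printf("stride %d -> %d of %d banks\n", stride, distinct, banks);
        }
        return 0;
    }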

Parallel program visualization using SIEVE.1

In this paper we introduce a new model for the design of performance analysis and visualization tools. The system integrates static code analysis, relational database designs and a spreadsheet model of interactive programming. This system provides a ...

The CODE 2.0 graphical parallel programming language

CODE 2.0 is a graphical parallel programming system that targets the three goals of ease of use, portability, and production of efficient parallel code. Ease of use is provided by an integrated graphical/textual interface, a powerful dynamic model of ...

Paralex: an environment for parallel programming in distributed systems

Modern distributed systems consisting of powerful workstations and high-speed interconnection networks are an economical alternative to special-purpose supercomputers. The technical issues that need to be addressed in exploiting the parallelism ...

Exploiting heterogeneous parallelism on a multithreaded multiprocessor

This paper describes an integrated architecture, compiler, runtime, and operating system solution to exploiting heterogeneous parallelism. The architecture is a pipelined multi-threaded multiprocessor, enabling the execution of very fine (multiple ...

An architectural framework for migration from CISC to higher performance platforms

We describe a novel architectural framework that allows software applications written for a given Complex Instruction Set Computer (CISC) to migrate to a different, higher performance architecture, without a significant investment on the part of the ...

Manchester data-flow: a progress report

The Manchester Data-Flow Machine, MDFM, has evolved continuously during the past decade. By the time the prototype uniprocessor hardware system was decommissioned, in 1989, the putative multi-processor architecture comprised separate Processing Elements ...

Array abstractions using semantic analysis of trapezoid congruences

With the growing use of vector supercomputers, efficient and accurate data-structure analyses are needed. In this paper we propose to use the quite general framework of Cousot's abstract interpretation for the particular analysis of multi-...

A comprehensive approach to parallel data flow analysis

We present a comprehensive approach to performing data flow analysis in parallel. We first identify three types of parallelism inherent in the data flow solution process: independent-problem parallelism, separate-unit parallelism and algorithmic ...

Compile-time analysis of communicating processes

We present an algorithm for analyzing deadlock and for constructing sequentializations of a class of communicating sequential processes. The algorithm may be used for deadlock detection in parallel and distributed programs at compile time, or for ...
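
A minimal instance of what such an analysis must detect, rendered here with locks rather than the paper's communication primitives (our analogue of two crossed synchronous sends): two threads that block on each other in opposite order.

    #include <pthread.h>

    /* Schematic deadlock: each thread holds one resource and blocks
       waiting for the other's. Under the unlucky interleaving,
       neither can proceed. Illustration only, not the paper's
       process notation. */
    static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

    static void *t1(void *arg) {
        (void)arg;
        pthread_mutex_lock(&a);
        pthread_mutex_lock(&b);   /* waits for t2 ... */
        pthread_mutex_unlock(&b);
        pthread_mutex_unlock(&a);
        return 0;
    }

    static void *t2(void *arg) {
        (void)arg;
        pthread_mutex_lock(&b);
        pthread_mutex_lock(&a);   /* ... while t2 waits for t1 */
        pthread_mutex_unlock(&a);
        pthread_mutex_unlock(&b);
        return 0;
    }

    int main(void) {
        pthread_t x, y;
        pthread_create(&x, 0, t1, 0);
        pthread_create(&y, 0, t2, 0);
        pthread_join(x, 0);       /* may never return: deadlock */
        pthread_join(y, 0);
        return 0;
    }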

Register requirements of pipelined processors

To enable concurrent instruction execution, scientific computers generally rely on pipelining, which combines with faster system clocks to achieve greater throughput. Each concurrently executing instruction requires buffer space, usually implemented as ...

Benchmarking a vector-processor prototype based on multithreaded streaming/FIFO vector (MSFV) architecture

This paper presents benchmark results for a vector-processor prototype based on the MSFV (multithreaded streaming/FIFO vector) architecture. The MSFV architecture is single-chip oriented, and thus its main objective is to save the off-chip memory ...

On storage schemes for parallel array access

In parallel matrix manipulation operations, some data patterns need to be accessed in one memory cycle without conflict. Investigating the frequently used data patterns, we propose a powerful skewing scheme which allows most frequently used data ...
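
The textbook starting point for such schemes (shown for illustration; the paper's scheme is more general) is skewed storage: placing A[i][j] in bank (i + j) mod M makes every row and every column of an M x M matrix conflict-free.

    #include <stdio.h>

    /* Classic (i + j) mod M skewing: within any row (fixed i) and any
       column (fixed j) the bank numbers are all distinct, so either
       pattern can be fetched in one conflict-free cycle. */
    int main(void) {
        int M = 5;
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < M; j++)
                printf("%d ", (i + j) % M);   /* bank of A[i][j] */
            printf("\n");
        }
        return 0;
    }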

A general algorithm for data dependence analysis

With the development of ever more sophisticated data flow analysis algorithms, traditional data dependence tests based on elementary loop information will not be sufficient in the future. In this paper, quite general algorithms are presented for solving ...

On exact data dependence analysis

The GCD test and the Banerjee-Wolfe test are the two tests traditionally used to determine statement data dependence, subject to direction vectors, in automatic vectorization / parallelization of loops. In an earlier study [14] a sufficient condition ...
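
For readers new to the area, the GCD test is easily stated: A[a*i + b] and A[c*j + d] can name the same element only if gcd(a, c) divides d - b. A direct transcription (ours, for illustration):

    #include <stdio.h>

    static int gcd(int x, int y) {
        while (y) { int t = x % y; x = y; y = t; }
        return x;
    }

    /* GCD test: a necessary condition for A[a*i + b] and A[c*j + d]
       to touch the same element. It is conservative: "may depend"
       is not a proof of dependence. */
    int gcd_test_may_depend(int a, int b, int c, int d) {
        return (d - b) % gcd(a, c) == 0;
    }

    int main(void) {
        /* A[2i] vs A[2j + 1]: gcd(2, 2) = 2 does not divide 1,
           so the accesses are provably independent. */
        printf("%s\n", gcd_test_may_depend(2, 0, 2, 1)
                       ? "may depend" : "independent");
        return 0;
    }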

Array privatization for parallel execution of loops

In recent experiments, array privatization played a critical role in the successful parallelization of several real programs. This paper presents compiler algorithms for the program analysis required by this transformation. The paper also addresses issues in the ...
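
The transformation in miniature (a schematic of ours; the paper supplies the analysis that detects such arrays automatically): when every iteration writes a work array before reading it, the array carries no values across iterations, so each processor can be given a private copy and the loop run in parallel.

    /* Array privatization in miniature. The shared work array t
       serializes the i-loop as written; but every iteration writes
       t before reading it, so a compiler may give each processor a
       private copy of t and run the i-loop iterations in parallel. */
    double t[100];                            /* shared: blocks parallelism */

    void smooth(double (*x)[100], int n) {
        for (int i = 0; i < n; i++) {         /* parallelizable once t
                                                 is privatized */
            for (int j = 0; j < 100; j++)
                t[j] = 0.5 * x[i][j];         /* t written first ... */
            for (int j = 1; j < 100; j++)
                x[i][j] = t[j] + t[j - 1];    /* ... then read */
        }
    }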

Contributors
  • Rice University
  • University of Illinois Urbana-Champaign


Acceptance Rates

Overall acceptance rate: 584 of 2,055 submissions, 28%

Year       Submitted  Accepted  Rate
ICS '21          157        39   25%
ICS '15          160        40   25%
ICS '14          160        34   21%
ICS '13          202        43   21%
ICS '06          141        37   26%
ICS '03          171        36   21%
ICS '02          144        31   22%
ICS '01          133        45   34%
ICS '00          122        33   27%
ICS '99          180        57   32%
ICS '97          135        45   33%
ICS '96          116        50   43%
ICS '95          120        49   41%
ICS '94          114        45   39%
Overall        2,055       584   28%