Chapter One - VLSI for SuperComputing: Creativity in R+D from applications and algorithms to masks and chips

https://doi.org/10.1016/bs.adcom.2022.01.001

Abstract

This article describes a course that teaches VLSI design in an unconventional way, to steer creativity and to motivate students to research new computing paradigms, like those presented in this Volume of Advances in Computers.

Introduction

This article describes a course that teaches VLSI design in an unconventional way, to steer creativity and to motivate students to research new computing paradigms, like those presented in this Volume of Advances in Computers. The course tries to stress the following issues:

  • (a)

    Experiences of the world's top foundries for VLSI in SuperComputing, such as Qualcomm [1], Intel [2], and Mubadala [3]. Some Toshiba experiences were also taken into consideration [4].

  • (b)

    Experiences that will teach students worldwide to become competitive for job openings in a new industry.

  • (c)

    Experiences that synergize the phases of a holistic VLSI design, namely:

    • Phase #1: From Applications to Algorithms,

    • Phase #2: From Algorithms to Masks, and

    • Phase #3: From Masks to Chips.

In this view, Phase #1 is best treated by mathematicians, Phase #2 is best treated by computer engineers, and Phase #3 is best treated by physical chemists. Consequently, versions of this course (dialects) were experimentally offered at three different schools of the University of Belgrade: the School of Mathematics, the School of Electrical Engineering, and the School of Physical Chemistry.

In each of its three dialects, this course stresses the following three issues:

  • Deep professional knowledge,

  • Detailed multi-dimensional verification, and

  • Relevant inter-disciplinary management.

The rest of this article describes the major elements of the presented course. These elements are the same for each one of its dialects; what differs are the teaching examples and homework assignments.

The remainder concentrates on the course dialect of interest to electrical engineering, so the stress is on the transformation from algorithms to masks. In other words, the main question is: for a given algorithm, what is the best implementation of the corresponding VLSI chip? To answer that question effectively, effort has to be invested in inducing creativity among students, so a part of this text concentrates on that issue, too.

This course differs from any other VLSI design course that the authors are aware of, because it compares different paradigms (all the major paradigms in use today) and enables students to take four selected algorithms (from four different application areas) through all the VLSI computing paradigms covered.

The four topics covered are: mathematics, image understanding, machine learning, and tensorflow. The four different paradigms covered are: (a) controlflow (multicore vs manycore), (b) dataflow (Maxeler DFE vs Google TPU), (c) diffusionflow (IoT vs WSN), and (d) advanced paradigms (quantum vs biological computing).

For the first two paradigms, with four sub-paradigms in total, students are shown examples from mathematics, image understanding, and machine learning; their task is then to vary the relevant parameters and study the effects.
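
The contrast between the first two paradigms can be made concrete with a toy example. The following is a minimal Python sketch, not the course's actual lab material: the kernel y = (a + b) * (c - d), the names GRAPH, controlflow, dataflow, and node_levels, and the cost model (one operation per step for controlflow; a fully pipelined graph that accepts one input row per cycle for dataflow) are all assumptions made for illustration.

```python
from operator import add, sub, mul

# Hypothetical kernel, chosen only for illustration: y = (a + b) * (c - d).
# Each node maps a name to (operation, names of its operands).
GRAPH = {
    "s": (add, ("a", "b")),
    "t": (sub, ("c", "d")),
    "y": (mul, ("s", "t")),
}


def controlflow(rows):
    """Run the kernel in program order, one operation per step."""
    results, steps = [], 0
    for a, b, c, d in rows:
        s = a + b
        t = c - d
        results.append(s * t)
        steps += 3  # add, sub, mul executed one after another
    return results, steps


def node_levels(graph):
    """Depth of each node: 1 + the deepest operand (external inputs are 0)."""
    levels = {}

    def level(name):
        if name not in graph:
            return 0
        if name not in levels:
            levels[name] = 1 + max(level(src) for src in graph[name][1])
        return levels[name]

    for name in graph:
        level(name)
    return levels


def dataflow(rows, graph=GRAPH, output="y"):
    """Fire every node whose operands are available; count pipelined cycles."""
    results = []
    for a, b, c, d in rows:
        values = {"a": a, "b": b, "c": c, "d": d}
        pending = dict(graph)
        while pending:
            ready = [n for n, (_, srcs) in pending.items()
                     if all(src in values for src in srcs)]
            if not ready:
                raise ValueError("graph has operands that are never produced")
            for n in ready:  # nodes in the same wave are independent
                op, srcs = pending.pop(n)
                values[n] = op(*(values[src] for src in srcs))
        results.append(values[output])
    # Assumed cost model: one new row enters the pipeline per cycle, and the
    # first result appears after as many cycles as the graph is deep.
    depth = max(node_levels(graph).values())
    cycles = len(rows) + depth - 1 if rows else 0
    return results, cycles
```

Both functions return the same results; only the assumed cost differs. For three input rows, this model charges controlflow 9 steps and the pipelined dataflow graph 4 cycles. Changing the number of rows, or adding nodes to GRAPH, is one way to vary the parameters and study the effects.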

Another important aspect of this course is that it also teaches the techno-economical aspects. It covers: (a) the methodologies to create ideas about possible improvements of the existing state-of-the-art (Blagojevic K20, Bankovic M24), (b) the mechanisms for fund-raising needed to implement the ideas, (c) the approaches of interest for project planning (CMMI and Scrum), and (d) the essence of patenting and trademarking.

Students appreciate techno-economical issues as much as technological ones. One of the past students currently holds the intellectual property behind the largest patent settlement in the history of ICT (Kavcic ref); another was involved in one of the largest Oracle acquisitions ever (Endeca); still another was a vice-president of Qualcomm at the time this paper was written (Milivoj Aleksic); and yet another was a vice-president of Intel (Dado Vrsalovic).

From algorithms to implementations

The electrical engineering dialect of the course is divided into three parts:

  • Part #1: VLSI for ControlFlow SuperComputing,

  • Part #2: VLSI for DataFlow SuperComputing, and

  • Part #3: VLSI for WirelessFlow SuperComputing.

Part #1 treats two topics of importance for ManyCore Systems and two topics of importance for MultiCore Systems, as follows:

  • ManyCore Systems:

  •   VHDL vs Verilog (0.5 weeks)

  •   Design and programming of a 200 MHz microprocessor (2.5 weeks)

  • MultiCore Systems:

  •   MicroProcessor and

Teaching experiences

The bottom line of this course is bringing advanced industrial experience into the classroom. In the first part, the experience is oriented to DARPA's first 200 MHz GaAs microprocessor. In the second part, the experience is oriented to Maxeler, currently the most successful DataFlow supercomputer. In the third part, the experience is oriented to the EU FP7 project ProSense.

The behavior of the students is observed during three different phases of their professional life:

  • (a)

    During the course,

  • (b)

    During the

Creativity in computing

After the teaching, lab exercises, and homework assignments, students are invited to join research efforts, after appropriate preparations.

The preparations include two phases:

  • (a)

    Teaching the methods to enhance creativity [13], [14], and

  • (b)

    Teaching the methods to present research results [15], [16].

Then, the students are led to reach the following conclusions:

  • (a)

    If the communication delays are negligible, the control flow approach could be the optimal paradigm.

  • (b)

    If the communication delays are massive,

The ultimate DataFlow

The recent public talks and university courses of Veljko Milutinovic have concentrated on the concept of Ultimate DataFlow for BigData, its potentials (up to 2000 in speedup, up to 200 in transistor count, and up to 20 in power savings), and its essence. Consequently, this section covers the issues related to the potentials of the concept, using the programming method of Maxeler, which is still far away from the ideal Ultimate DataFlow, but does achieve considerable speedups over Intel, and

Conclusion

The major purpose of this course was to lead students through an independent design of the computing infrastructure at the core of three different computing paradigms: ControlFlow, DataFlow, and WirelessFlow. By doing designs, through the standard mechanism of homework assignments, the students were able not only to learn the state of the art, but also to build the background they need to reach out inventively into the world of computer engineering research.

This approach is of benefit to

Acknowledgment

The authors are thankful to professors Michael Flynn of Stanford and Oskar Mencer of Imperial for extremely valuable comments related to the improvement of the course described in this article, and to Nobel Laureates Martin Perl of Stanford and Jerome Friedman of MIT for their valuable discussions during their visits to Belgrade.

Prof. Veljko Milutinović (1951) received his PhD from the University of Belgrade in Serbia, spent about a decade in various faculty positions in the USA (mostly at Purdue University and more recently at Indiana University in Bloomington), and was a co-designer of DARPA's pioneering GaAs RISC microprocessor at 200 MHz (about a decade before the first commercial effort at that same speed) and also a co-designer of the related GaAs Systolic Array (with 4096 GaAs microprocessors). Later, for almost three decades, he taught and conducted research at the University of Belgrade in Serbia, for the departments of EE, MATH, BA, and PHYS/CHEM. His research is mostly in datamining algorithms and dataflow computing, with the emphasis on mapping data analytics algorithms onto fast, energy-efficient architectures. Most of his research was done in cooperation with industry (Intel, Fairchild, Honeywell, Maxeler, HP, IBM, NCR, RCA, etc.). For 10 of his books, forewords were written by 10 different Nobel Laureates with whom he cooperated on his past industry-sponsored projects. He has published 40 books (mostly in the USA), has over 100 papers in SCI journals (mostly in IEEE and ACM journals), and has presented invited talks at over 400 destinations worldwide. He has well over 1000 Thomson-Reuters WoS citations, well over 1000 Elsevier SCOPUS citations, and about 4000 Google Scholar citations. His Google Scholar h-index is 36. He has been a Life Fellow of the IEEE since 2003 and a Member of The Academy of Europe since 2011. He is a member of the Serbian National Academy of Engineering and a Foreign Member of the Montenegro National Academy of Sciences and Arts.

Miloš Kotlar received his B.Sc. (2016) and M.Sc. (2017) degrees in Electrical and Computer Engineering from the University of Belgrade, School of Electrical Engineering, Serbia. He is a PhD candidate at the School of Electrical Engineering, University of Belgrade. His general research interests include energy-efficient tensor implementations using the dataflow paradigm (FPGA and ASIC accelerators) and meta-learning approaches for anomaly detection tasks.

Dr Jakob Salom received his BSc degree from the University of Belgrade, School of Electrical Engineering. He is the author or co-author of three books, dozens of sections in international books, and more than 20 peer-reviewed articles in journals and conference anthologies.

Saša Stojanović is on the faculty of the Department of Computer Engineering in the School of Electrical Engineering, University of Belgrade, Serbia. His PhD thesis, defended in the year 2015, was related to software similarity. He teaches courses on Embedded Systems and Mobile Devices Programming. His current research is in the fields of Software Similarity and Reverse Engineering.

Živojin Šuštran received the BSc and MSc degrees in electrical engineering and computing from the School of Electrical Engineering, University of Belgrade, Serbia, in 2010 and 2012, respectively, where he is currently pursuing the PhD degree. He is currently a Teaching Assistant with the School of Electrical Engineering, University of Belgrade. He has been involved in the research and development of hardware and software solutions in industry and academia for ten years, with expertise in computer architecture, cache memory design, systems programming, operating systems, and FPGA acceleration. He has coauthored two journal articles and given talks at conferences in Europe. His current research interests include cache coherence and shared memory algorithms, hardware transactional memory, and multicore architectures, with special emphasis on asymmetric multiprocessors.

Aleksandar Veljković, Teaching Assistant, Faculty of Mathematics, University of Belgrade.

Jelena Marković, Teaching Assistant, Faculty of Mathematics, University of Belgrade.

Ali R. Hurson is a professor in the Departments of Computer Science and of Electrical and Computer Engineering at Missouri S&T. From 2008 to 2012 he served as the computer science department chair. Before joining Missouri S&T, he was a professor in the Computer Science and Engineering department at The Pennsylvania State University. His research for the past 35 years has been supported by NSF, DARPA, Department of Education, Department of Transportation, Air Force, Office of Naval Research, NCR Corp., General Electric, IBM, Lockheed Martin, Penn State University, and Missouri S&T. He has published over 330 technical papers in areas including computer architecture/organization, cache memory, parallel and distributed processing, Sensor and Ad Hoc Networks, dataflow architectures, VLSI algorithms, security, Mobile and pervasive computing, database systems, multidatabases, global information sharing processing, application of mobile agent technology, and object-oriented databases.

Professor Hurson has been active in various IEEE/ACM Conferences and has given tutorials on global information sharing, database management systems, supercomputer technology, data/knowledge-based systems, dataflow processing, scheduling and load balancing, parallel computing, pervasive computing, green computing, and sustainability. He served as an IEEE editor, IEEE distinguished speaker, and an ACM distinguished lecturer. Currently, he is Editor-in-Chief of Advances in Computers, editor of The CSI Journal of Computer Science and Engineering, and editor of Computing Journal.
