### **DEPARTMENT: MICRO LAW**

# Microarchitecture Patents Over Time and Interesting Early Microarchitecture Patents

Joshua J. Yi, The Law Office of Joshua J. Yi, PLLC, Austin, TX, 78750, USA

In honor of the 50th anniversary of the microprocessor, this article analyzes the number of microarchitectural (core+memory) patents over the last 50 years and presents the earliest patent that I could find on the microprocessor, branch prediction, out-of-order execution, and next-line prefetching.

## MICROARCHITECTURE PATENTS OVER TIME

Figure 1 depicts the number of microarchitectural patents issued, based on when the applications for those patents were filed.<sup>a</sup>

The purple line in Figure 1, which is labeled as "core," corresponds to the 712 patent class in the U.S. Patent Classification System, which covers "Electrical computers and digital processing systems: Processing architectures and instruction processing (e.g., processors)" and represents patents on the microprocessor core. The green line, which is labeled as "memory," corresponds to the 711 patent class, which covers "Electrical computers and digital processing systems: Memory" and represents patents on a microprocessor's memory hierarchy. The red line is the sum of the numbers of core and memory patents, which could represent microarchitecture patents. The remainder of this article may interchangeably refer to memory patents as 711 patents and core patents as 712 patents.


<sup>a</sup>One advantage of presenting the data by the application date is that it reveals how quickly after the debut of the first microprocessor inventors started filing microprocessor patents. There are several other reasons to categorize patents by the application year, as opposed to issue year. Some of those reasons can be found in the *Micro Law* article in the September–October issue of *IEEE Micro*.

<sup>b</sup>There does not appear to be a mapping between the U.S. Patent Classification System and the CPC system. Furthermore, the subcategories of the latter are divided in such a way that I was not able to determine a group of subcategories that substantially corresponded to either the 711 or 712 patent classes.

0272-1732 © 2021 IEEE. Digital Object Identifier 10.1109/MM.2021.3118437. Date of current version 19 November 2021.

**FIGURE 1.** Number of core, memory, and total (core+memory) U.S. patents issued between 1971 and 2020 based on the application date. The dotted lines between 2013 and 2020 indicate that the numbers for those years are estimated.

In January 2013, the U.S. Patent and Trademark Office (PTO) discontinued using the U.S. Patent Classification System in favor of using the Cooperative Patent Classification (CPC) system.<sup>b</sup> Therefore, starting in 2013, the PTO's full-text database does not have any information regarding how many 711 or 712 patents were issued per year. To estimate the number of 711 and 712 patents for 2013 to 2020, I first determined the total number of patents in several subcategories of the CPC system that appeared to be closely related to the 711 and 712 patent classes, albeit significantly narrower in scope. The number of patents in these CPC subcategories in 2012 was 404. I then compared the number of 711 and 712 patents in 2012 (1985 and 262, respectively) with the number of patents in the CPC subcategories in 2012 (404) to determine multipliers for the 711 and 712 patent classes. The multiplier for the 711 patent class was 4.91 (1985/404), whereas the multiplier for the 712 patent class was 0.65 (262/404). I then multiplied the number of patents in the CPC subcategories in 2013 to 2020 by each multiplier to estimate the number of patents in the 711 and 712 patent classes. For example, in 2013, the number of patents in the CPC subcategories was 450. Multiplying that number by the respective multipliers for the 711 and 712 patent classes provided estimates of 2211 patents in the 711 class and 292 patents in the 712 class. Figure 1 uses dotted lines between 2013 and 2020 to indicate that the numbers are estimated.
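The multiplier-based estimate for 2013 can be reproduced with a few lines of Python. This is only a sketch of the arithmetic described in the text; the constant and function names are my own, and rounding to the nearest whole patent is an assumption about how the estimates were reported.

```python
# Counts quoted in the article.
USPC_711_2012 = 1985  # memory (class 711) patents in 2012
USPC_712_2012 = 262   # core (class 712) patents in 2012
CPC_2012 = 404        # patents in the selected CPC subcategories in 2012
CPC_2013 = 450        # patents in the same CPC subcategories in 2013


def multiplier(uspc_count, cpc_count):
    """Ratio that scales a CPC subcategory count to a USPC class count."""
    return uspc_count / cpc_count


def estimate(cpc_count, mult):
    """Estimated USPC class count, rounded to a whole patent (assumption)."""
    return round(cpc_count * mult)


m711 = multiplier(USPC_711_2012, CPC_2012)
m712 = multiplier(USPC_712_2012, CPC_2012)
est_711_2013 = estimate(CPC_2013, m711)
est_712_2013 = estimate(CPC_2013, m712)

print(f"711 multiplier: {m711:.2f}, 2013 estimate: {est_711_2013}")
print(f"712 multiplier: {m712:.2f}, 2013 estimate: {est_712_2013}")
```

The unrounded multipliers reproduce the 2211 and 292 quoted in the text. Using the multiplier rounded to 4.91 would instead give 2209.5 for the 711 class, so the published estimates appear to come from the unrounded ratios.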

The sharp decrease in the number of issued patents in 2019 and 2020 (and, to a lesser extent, 2018) is due to arranging the patents according to their application year (as compared to the issue year). Because many applications that were filed in 2019 and 2020 are still pending and have not yet issued, the number of issued patents with 2019 and 2020 application dates is lower than it will be a few years from now (i.e., after the pending applications have issued or been abandoned).

The first result in Figure 1 is that both the number of core patents and the number of memory patents increased between 1971 and 1997, although the number of core patents increased more slowly than the number of memory patents. For example, the number of issued core patents did not exceed 100 until 1979, whereas the number of issued memory patents was greater than 100 in 1976. After 1997, the number of issued core patents decreased from a high of 1310 in 1997 to a low of 251 in 2014, before slightly rebounding to an average of around 300 per year between 2014 and 2018. By contrast, the number of memory patents continued to generally increase, achieving a peak of 3830 in 2008.

For both core and memory patents, the number of patents decreased after 2008. For example, between 2008 and 2014, the number of core patents decreased from 787 to 251 (a 68.1% decrease), whereas the number of memory patents decreased from 3830 to 1906 (a 50.2% decrease). The likely reason for this decrease is the Great Recession, which started around 2008. Due to financial hardships, companies may have been much more selective, perhaps for several years, about which patent applications to pursue, in addition to potentially scaling back their research and development. (Note: Some of the decrease in the number of 711 and 712 patents in 2012, as compared to 2011, may also be due to the PTO beginning to switch from the U.S. Patent Classification System to the CPC system.)
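The percentage decreases quoted above follow directly from the 2008 and 2014 counts. The short check below is a sketch of that arithmetic; the function name is my own.

```python
def percent_decrease(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Counts quoted in the article for 2008 and 2014.
core_drop = percent_decrease(787, 251)
memory_drop = percent_decrease(3830, 1906)

print(f"core: {core_drop:.1f}%  memory: {memory_drop:.1f}%")
```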

Since 2014, however, the numbers of core and memory patents have rebounded from their 2014 lows. For example, the number of core patents increased from 251 in 2014 to 333 in 2017, whereas the number of memory patents increased from 1906 in 2014 to 2853 in 2017. This increase may be the result of increased research and development budgets, as well as improved financial situations for companies.

Finally, although the numbers of core and memory patents increased between 2014 and 2017, these numbers are significantly lower than their pre-Great Recession numbers. While these lower numbers may be the result of decreased innovation directed toward the core and memory, or of companies being less aggressive in pursuing core and memory patents, the apparent decrease from the pre-Great Recession highs may not actually reflect less computer architecture innovation. In particular, rather than occurring in the core and memory, the innovation may have simply "migrated" to areas "adjacent" to the core and memory (e.g., SoCs, multiprocessors, and accelerators) to a greater extent than prior to the Great Recession.

November/December 2021 IEEE Micro 173

**TABLE 1.** Comparison of memory and core patents on a decade granularity.

| Decade    | Number of memory patents | Number of core patents | Memory/core ratio |
|-----------|--------------------------|------------------------|-------------------|
| 1971–1980 | 846                      | 634                    | 1.33              |
| 1981–1990 | 2241                     | 1793                   | 1.25              |
| 1991–2000 | 14,222                   | 8718                   | 1.63              |
| 2001–2010 | 27,348                   | 7695                   | 3.55              |
| 2011–2020 | 19,470                   | 2607                   | 7.47              |

The next result from Figure 1 is that there are significantly more issued memory patents than core patents. Overall, between 1971 and 2020, there were 2.99 times as many memory patents as core patents (64,127 memory patents as compared to 21,447 core patents). Most of this difference appears to have occurred in the last 20 years. To illustrate this result, Table 1 presents the results on a decade-by-decade basis.


The results in Table 1 show that between 1971 and 2000, the ratio of memory to core patents was between 1.25 and 1.63 per decade, and 1.55 overall. By contrast, between 2001 and 2020, the ratio was between 3.55 and 7.47 per decade, and 4.54 overall. The increase in the overall ratio from 1.55 between 1971 and 2000 to 4.54 between 2001 and 2020, and the fact that the number of core patents steadily decreased from 1997 to 2014, could indicate that microprocessor companies found it more difficult to come up with inventions related to the core and/or saw that memory patents were more valuable due to significantly increasing memory latency.
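The per-decade and multi-decade ratios can be recomputed from the counts in Table 1. The sketch below is illustrative only; the dictionary layout and the helper name are my own.

```python
# (memory patents, core patents) per decade, copied from Table 1.
TABLE_1 = {
    "1971-1980": (846, 634),
    "1981-1990": (2241, 1793),
    "1991-2000": (14222, 8718),
    "2001-2010": (27348, 7695),
    "2011-2020": (19470, 2607),
}


def memory_to_core_ratio(decades):
    """Ratio of total memory patents to total core patents over `decades`."""
    memory = sum(TABLE_1[d][0] for d in decades)
    core = sum(TABLE_1[d][1] for d in decades)
    return memory / core


all_decades = list(TABLE_1)
early, late = all_decades[:3], all_decades[3:]

for decade in all_decades:
    print(decade, round(memory_to_core_ratio([decade]), 2))
print("1971-2000:", round(memory_to_core_ratio(early), 2))
print("2001-2020:", round(memory_to_core_ratio(late), 2))
print("1971-2020:", round(memory_to_core_ratio(all_decades), 2))
```

Summing the counts before dividing, rather than averaging the per-decade ratios, matches the overall figures of 1.55, 4.54, and 2.99 given in the text.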

## INTERESTING EARLY MICROARCHITECTURE PATENTS

The remainder of this article presents a few patents from the early years of microprocessor and computer architecture. For each patent, this article lists the assignee, filing and issue dates, inventors, title, and abstract. This article also includes a representative figure (if helpful) and a representative claim.

To find the following patents, I first searched for the particular microarchitectural feature in the PTO's full-text database. But, unfortunately, this database only contains the full text of patents that were issued in 1976 or after. Therefore, if the oldest patents in that database for a particular microarchitectural feature did not appear to be the first patent on that feature, I then reviewed the patents that were listed as prior art on the oldest patents for that feature. And, if necessary, I iteratively reviewed the patents listed on the prior art patents until I found what could be the earliest patent on a particular microarchitectural feature.

Two of the patents below may be the first microprocessor patent and the first branch predictor patent. The other two appear to cover out-of-order execution and prefetching, and both of those predate the first microprocessor.

The first patent may be the original patent on Intel's 4004 microprocessor:

**Assignee**: Intel Corporation

**Number**: 3,821,715 **Filed**: January 22, 1973 **Issued**: June 28, 1974

**Inventors**: Marcian Edward Hoff, Jr., Stanley Mazor, Federico Faggin

**Title**: Memory system for a multi chip digital computer

**Abstract**: A general purpose digital computer which comprises a plurality of metal-oxide-semiconductor (MOS) chips. Random-access memories (RAMs) and read-only memories (ROMs) used as part of the computer are coupled to common bi-directional data buses to a central processing unit (CPU) with each memory including decoding circuitry to determine which of the plurality of memory chips is being addressed by the CPU. The computer is fabricated using chips mounted on standard 16 pin dual inline packages allowing additional memory chips to be added to the computer.

Claim 1 from the patent recites the following.

- A general purpose digital computer comprising the following:
  - a) a central processor disposed on a first semiconductor chip;
  - b) a plurality of bidirectional data bus lines;
  - c) at least a separate first and second semiconductor memory chip each defining a memory and each including a chip decoding circuit for recognizing a different predetermined code on said bidirectional data bus lines and for activating a portion of said memory upon receipt of said predetermined code, said data bus lines interconnecting said processor and said first and second memory chips for communicating said different predetermined codes from said processor to at least one of said first and second memory chips and for communicating data signals for one of said first and second memory chips to said processor; whereby said processor may communicate signals to said first and second memory chips and said decoding circuits shall determine which memory is being addressed.

**FIGURE 2.** Figure from U.S. Patent No. 3,821,715, which was assigned to Intel.

Figure 1 from the patent (shown as Figure 2 in this article) shows a general block diagram of the disclosed computer with a central processing unit, a single RAM, and a single ROM.

The next patent is the earliest branch predictor-related patent that I found:

**Assignee**: Control Data Corporation

**Number**: 4,370,711 **Filed**: October 21, 1980 **Issued**: January 25, 1983

**Inventors**: James E. Smith

**Title**: Branch predictor using random access memory

**Abstract**: A system is provided for predicting in advance the result of a conditional branch instruction in a computer system. The system includes a hash mechanism, a random access memory, an address buffer, a branch outcome result receiving means and a counter buffer. The hash mechanism and memory use the input branch instruction address to produce a count which in effect is a way of weighting recent branch history to predict the branch decision. The counts are stored in the random access memory (RAM). The random access memory is addressed by the hashed branch instruction address to produce the system result.

Claim 1 from the patent recites the following.

- A branch prediction mechanism comprising the following:
  - a) a prediction memory;
  - b) an instruction address receiving means;
  - c) a hash address device connected to receive instruction addresses from said instruction address receiving means;
  - d) an address register connected with said prediction memory and to said hash address device for receiving hash addresses from said hash address device;
  - e) an address buffer connected to said hash address device for receiving hash addresses from said hash address device;
  - f) a write hash address register connected to said address buffer for receiving hash addresses from said address buffer and connected to said prediction memory;
  - g) a count register connected to said prediction memory for receiving the output from said prediction memory;
  - h) a count buffer connected to said count register for receiving output counts from said count register;
  - i) an increment-decrement unit connected to said count buffer for receiving input counts from said count buffer and having an input for receiving branch outcome information for updating the count according to the branch outcome;
  - j) a write data register connected to said increment-decrement unit for receiving the updated count information from said increment-decrement unit said write data register being connected to said prediction memory to provide updated count information corresponding to the hash address in said write hash address register;
  - k) wherein said branch prediction device operates to predict branch instructions on a continuously updated history of recently executed branch instructions.

**FIGURE 3.** Figure from U.S. Patent No. 4,370,711, which was assigned to Control Data Corporation.

Figure 2 from the patent (shown as Figure 3 in this article) is a block diagram of the claimed branch prediction system.
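The abstract and claim describe a memory of counts that is addressed by a hashed branch address and updated by an increment-decrement unit according to the branch outcome. A minimal Python sketch of that general idea follows. The table size, the modulo hash, the 2-bit saturating counters, the initial counter value, and all names are illustrative assumptions, not details taken from the patent.

```python
class CounterBranchPredictor:
    """Table of up/down counters indexed by a hash of the branch address."""

    def __init__(self, entries=16, bits=2):
        self.entries = entries                   # table size (assumption)
        self.max_count = (1 << bits) - 1         # counters saturate here (assumption)
        self.threshold = 1 << (bits - 1)         # predict taken at or above this
        self.table = [self.threshold] * entries  # initial value (assumption)

    def _hash(self, address):
        # Hypothetical hash: low-order bits of the branch address.
        return address % self.entries

    def predict(self, address):
        """Return True if the branch at `address` is predicted taken."""
        return self.table[self._hash(address)] >= self.threshold

    def update(self, address, taken):
        """Increment the count on a taken branch, decrement otherwise."""
        i = self._hash(address)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)


def count_correct(predictor, address, outcomes):
    """Predict, then train, on each outcome; return the number of correct predictions."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(address) == taken:
            correct += 1
        predictor.update(address, taken)
    return correct


# A loop branch that is taken nine times and then falls through, run three times.
loop_outcomes = ([True] * 9 + [False]) * 3
correct = count_correct(CounterBranchPredictor(), 0x40, loop_outcomes)
print(f"{correct} of {len(loop_outcomes)} predictions correct")
```

Because the table is indexed by a hash, two branches whose addresses hash to the same entry share one counter. That keeps the memory small at the cost of occasional interference between unrelated branches.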

The next patent is the earliest patent I found that relates to out-of-order execution and scoreboarding:

**Assignee**: Control Data Corporation

**Number**: 3,346,851 **Filed**: July 8, 1964 **Issued**: October 10, 1967

**Inventors**: James E. Thornton, Seymour R. Cray

**Title**: Simultaneous multiprocessing computer system

**Abstract**: A digital computer central processor is disclosed having a plurality of arithmetic or functional units and a scoreboard for instruction control which enables simultaneous execution of a plurality of instructions from a single program. This invention relates to a digital computer central processor and more particularly to a method and apparatus which control, in an orderly sequence, simultaneous operations of functional units in a high speed digital computer.

Claim 1 from the patent recites the following.

- A data processing system having a storage section, an input-output section, a control section, and a function section, the invention being characterized by the following:
  - a) the function section comprising the following:
    - i) a plurality of registers (1.60) connected to the storage section and capable of holding numerical data received from the storage section and transmitting data to the storage section;
    - ii) a plurality of functional units (2.12) connected to the registers, each unit being capable of performing arithmetic and logical operations on data received from the registers and transmitting results to the registers;
  - b) the control section comprising the following:
    - i) a single instruction source (1.76) connected to the storage section and capable of receiving a plurality of instruction commands sequentially from the storage comprising;
    - ii) a scoreboard logic network (1.66) connected to the instruction source, the functional units, and the registers, and responsive to signals from the instruction source to reserve selected functional units and selected registers for numerical data and results, the scoreboard logic network providing control signal to allow concurrent operation of a plurality of functional units in cooperation with the selected registers.

**FIGURE 4.** Figure from U.S. Patent No. 3,346,851, which was assigned to Control Data Corporation.

Figure 2 from the patent (shown as Figure 4 in this article) is a block diagram that illustrates the components of the digital data central processor.

The patent describes the operation of the scoreboard as follows.

"Thereafter, an instruction is issued from the instruction source 1.76 to the scoreboard 1.66 via line 1.80. The scoreboard 1.66 reserves the functional unit necessary to perform the instruction calculation and reserves certain of the operational register for use by that functional unit. While the functional unit is performing the first instruction, the scoreboard 1.66 allows another instruction to issue, and then proceeds to have that instruction performed simultaneously with the initial instruction. The scoreboard will allow instructions to be issued and performed simultaneously until a conflict of functional units or of registers is encountered. Thus, the scoreboard tends to keep the functional units in a high degree of simultaneous operations." (U.S. Patent No. 3,346,851 at 5:54–67)
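The quoted passage can be illustrated with a heavily simplified issue model. The sketch below is not the patent's logic: it assumes in-order issue of at most one instruction per cycle, treats a register as unavailable until the instruction writing it completes, and uses made-up unit names, register names, and latencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    unit: str      # functional unit the instruction needs (hypothetical names)
    dest: str      # destination register
    srcs: tuple    # source registers
    latency: int   # cycles the unit stays reserved (hypothetical values)


def issue_cycles(program):
    """Return the cycle in which each instruction issues, in program order."""
    unit_free_at = {}  # unit -> first cycle it is free again
    reg_ready_at = {}  # register -> first cycle its pending result is available
    next_cycle = 0
    issued = []
    for instr in program:
        regs = (instr.dest,) + tuple(instr.srcs)
        # Stall until the unit is free and no named register has a pending write.
        earliest = max(
            [next_cycle, unit_free_at.get(instr.unit, 0)]
            + [reg_ready_at.get(r, 0) for r in regs]
        )
        issued.append(earliest)
        unit_free_at[instr.unit] = earliest + instr.latency
        reg_ready_at[instr.dest] = earliest + instr.latency
        next_cycle = earliest + 1  # at most one issue per cycle
    return issued


program = [
    Instr("multiply", "X1", ("X2", "X3"), 10),
    Instr("add", "X4", ("X5", "X6"), 4),   # independent: issues while multiply runs
    Instr("add", "X7", ("X5", "X6"), 4),   # functional unit conflict: add is busy
    Instr("shift", "X0", ("X1",), 3),      # register conflict: X1 is still pending
]
print(issue_cycles(program))
```

In this model the second instruction issues one cycle after the first and runs alongside it, while the third and fourth wait on a functional unit conflict and a register conflict, respectively.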

The next patent could be the earliest prefetching-related patent, specifically on next-line prefetching:

**Assignee**: International Business Machines Corporation

**Number**: 3,654,622 **Filed**: December 31, 1969 **Issued**: April 4, 1972

**Inventors**: William F. Beausoleil

**Title**: Auxiliary storage apparatus with continuous data transfer

**Abstract**: An electronic bulk storage having the characteristics of a sequential access storage device. Data are stored parallel by word in a plurality of electronically rotatable memory elements selectable by a memory selection matrix. Each element has a feedback loop for recirculating data and when selected, a group of elements at an address N is read in parallel a word at a time by electronically rotating data bits stored in the selected memory elements at an address. Controls are provided to select memory elements N+1 whenever elements at address N are selected by the selection matrix. First data is read out of the elements at address N and then data is read out of the elements at address N+1 without any time lost for reselection of memory elements.

Claim 13 from the patent recites the following:

- A memory comprising the following:
  - 1) a plurality of multibit memory elements arranged in columns and rows in memory planes, one plane for each bit position of a word;
  - 2) address decoding means for selecting a column and a row to thereby select a first memory element location on each plane, and to automatically select a next sequential memory element location;
  - 3) means for electronically rotating the bits in said selected memory elements in unison;
  - 4) means for reading out words in parallel, or writing words in parallel from the first selected group of memory elements; and
  - 5) means operative when a word boundary for the first selected elements is reached for halting the reading or writing for the second selected elements whereby no delay is imposed for deselection and reselection of memory elements.
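The idea of selecting address N+1 whenever address N is selected has a familiar analog in cache terms. The sketch below is only that analogy, not the patent's rotating-memory hardware: it assumes a hypothetical 64-byte line, an unbounded buffer of resident lines, and names of my own choosing.

```python
class LineBuffer:
    """Tracks which lines are resident, optionally fetching line N+1 with line N."""

    def __init__(self, line_size=64, prefetch_next=True):
        self.line_size = line_size          # bytes per line (assumption)
        self.prefetch_next = prefetch_next
        self.resident = set()               # unbounded capacity (assumption)
        self.misses = 0

    def access(self, address):
        """Access `address`; return True on a hit and False on a miss."""
        line = address // self.line_size
        hit = line in self.resident
        if not hit:
            self.misses += 1
            self.resident.add(line)
        if self.prefetch_next:
            self.resident.add(line + 1)     # select N+1 whenever N is selected
        return hit


def count_misses(addresses, prefetch_next):
    """Count misses for an address trace with or without next-line prefetching."""
    buf = LineBuffer(prefetch_next=prefetch_next)
    for address in addresses:
        buf.access(address)
    return buf.misses


# Sequential walk over eight consecutive lines.
sequential = [line * 64 for line in range(8)]
print("misses without prefetch:", count_misses(sequential, False))
print("misses with next-line prefetch:", count_misses(sequential, True))
```

For a purely sequential walk, only the first access misses when line N+1 is always selected alongside line N, which mirrors the abstract's point that reading N+1 follows N without a reselection delay.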



JOSHUA J. YI is a solo practitioner who serves as a court-appointed Technical Advisor for the Honorable Alan D. Albright, U.S. District Judge for the Western District of Texas, Waco Division, Waco, TX, USA. His research interests include microarchitecture and performance methodology. Yi received a Ph.D. degree in electrical engineering from the University of Minnesota, Minneapolis, MN, USA, and a J.D. degree from the University of Texas at Austin, Austin, TX, USA. Contact him at josh@joshuayipatentlaw.com.


