#### CS 110 Computer Architecture Synchronous Digital Systems

Instructor: Sören Schwertfeger

http://shtech.org/courses/ca/

School of Information Science and Technology SIST

ShanghaiTech University

Slides based on UC Berkeley's CS61C

#### Compiling, Assembling, Linking, Loading (CALL) a Program



# Compiler

- Input: High-Level Language Code (e.g., foo.c)
- Output: Assembly Language Code (e.g., foo.s for MIPS)
- Note: Output *may* contain pseudo-instructions
- <u>Pseudo-instructions</u>: instructions that the assembler understands but that are not in the machine's instruction set. For example:
  - move \$s1,\$s2 $\Rightarrow$ add \$s1,\$s2,\$zero

#### Assembler

- Input: Assembly Language Code (MAL) (e.g., **foo.s** for MIPS)
- Output: Object Code, information tables (TAL) (e.g., foo.o for MIPS)
- Reads and Uses Directives
- Replace Pseudo-instructions
- Produce Machine Language
- Creates Object File
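
The "Replace Pseudo-instructions" step can be sketched as a simple text rewrite. This is only an illustration, not how a real MIPS assembler is organized: the function name `expand_pseudo` is hypothetical, and only the `move` rewrite shown in the compiler example is handled.

```python
def expand_pseudo(line: str) -> str:
    """Rewrite the `move` pseudo-instruction as a real `add` (sketch only)."""
    parts = line.replace(",", " ").split()
    if len(parts) == 3 and parts[0] == "move":
        dst, src = parts[1], parts[2]
        # move dst,src is the same as adding $zero to src
        return f"add {dst},{src},$zero"
    return line  # every other instruction is passed through unchanged


print(expand_pseudo("move $s1,$s2"))
```

A real assembler would also handle other pseudo-instructions and would work on parsed instructions rather than raw strings.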

## Linker

- Input: Object code files, information tables (e.g., foo.o,libc.o for MIPS)
- Output: Executable code (e.g., a.out for MIPS)
- Combines several object (.o) files into a single executable ("<u>linking</u>")
- Step 1: combine text segments from .o files
- Step 2: combine data segments from .o files
- Step 3: Resolve references:
  - Go through Relocation Table; handle each entry => Resolve absolute addresses

#### Loader Basics

- Input: Executable Code (e.g., a.out for MIPS)
- Output: (program is run)
- Executable files are stored on disk
- When one is run, loader's job is to load it into memory and start it running
- In reality, the loader is part of the operating system (OS)
  - Loading is one of the OS tasks

#### Static vs Dynamically linked libraries

- What we've described is the traditional way: statically-linked approach
  - The library is now part of the executable, so if the library updates, we don't get the fix (have to recompile if we have source)
  - It includes the <u>entire</u> library even if not all of it will be used
  - Executable is self-contained
- An alternative is dynamically linked libraries (DLL), common on Windows (.dll) & UNIX (.so, shared object) platforms

en.wikipedia.org/wiki/Dynamic\_linking

# **Dynamically linked libraries**

#### • Space/time issues

+ Storing a program requires less disk space

+ Sending a program requires less time

+ Executing two programs requires less memory (if they share a library)

- At runtime, there's time overhead to do link

#### Upgrades

+ Replacing one file (libXYZ.so) upgrades every program that uses library "XYZ"

- Having the executable isn't enough anymore

Overall, dynamic linking adds quite a bit of complexity to the compiler, linker, and operating system. However, it provides many benefits that often outweigh these costs.

## **Dynamically linked libraries**

- The prevailing approach to dynamic linking uses machine code as the "lowest common denominator"
  - The linker does not use information about how the program or library was compiled (i.e., what compiler or language)
  - This can be described as "linking at the machine code level"
  - This isn't the only way to do it ...

#### In Conclusion...

- Compiler converts a single HLL file into a single assembly language file.
- Assembler removes pseudoinstructions, converts what it can to machine language, and creates a checklist for the linker (relocation table). A .s file becomes a .o file.
  - Does 2 passes to resolve addresses, handling internal forward references
- Linker combines several .o files and resolves absolute addresses.
  - Enables separate compilation, libraries that need not be compiled, and resolves remaining addresses
- Loader loads executable into memory and begins execution.



#### Levels of Representation/Interpretation



temp = v[k]; v[k] = v[k+1]; v[k+1] = temp;

| lw | \$t0, 0(\$2) |
|----|----------------|
| lw | \$t1, 4(\$2) |
| sw | \$t1, 0(\$2) |
| sw | \$t0, 4(\$2) |

Anything can be represented as a *number*, i.e., data or instructions

0000100111000110101011110101100010101111010110000000100111000110110001101010111101011000000010010101100000001001110001101111



#### You are Here!

- Software
- Parallel Requests: assigned to computer, e.g., Search "Katz"
- Parallel Threads: assigned to core, e.g., Lookup, Ads
- Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
- Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
- Hardware descriptions: all gates @ one time
- Programming Languages



#### Hardware Design

- Next several weeks: how a modern processor is built, starting with basic elements as building blocks
- Why study hardware design?
  - Understand capabilities and limitations of HW in general and processors in particular
  - What processors can do fast and what they can't do fast (avoid slow things if you want your code to run fast!)
  - Background for more in-depth HW courses
  - Hard to know what you'll need for next 30 years
  - There is only so much you can do with standard processors: you may need to design own custom HW for extra performance
    - Even some commercial processors today have customizable hardware!
    - E.g. Google Tensor Processing Unit (TPU)

# Synchronous Digital Systems

Hardware of a processor, such as the MIPS, is an example of a Synchronous Digital System

Synchronous:

- All operations coordinated by a central clock
  - "Heartbeat" of the system!

Digital:

- Represent all values by discrete values
- Two binary digits: 1 and 0
- Electrical signals are treated as 1's and 0's
  - 1 and 0 are complements of each other
- High /low voltage for true / false, 1 / 0

#### Switches: Basic Element of Physical Implementations

 Implementing a simple circuit (arrow shows action if wire changes to "1" or is *asserted*):



# Switches (cont'd)

Compose switches into more complex ones (Boolean functions):



#### **Historical Note**

- Early computer designers built ad hoc circuits from switches
- Began to notice common patterns in their work: ANDs, ORs, ...
- Master's thesis (by Claude Shannon, 1940) made the link between their work and 19<sup>th</sup>-century mathematician George Boole
  - Called it "Boolean" in his honor
- Could apply math to give theory to hardware design, minimization, ...



#### Transistors

- High voltage (V<sub>dd</sub>) represents 1, or true
  - In modern microprocessors, Vdd ~ 1.0 Volt
- Low voltage (0 Volt or Ground) represents 0, or false
- Pick a midpoint voltage to decide if a 0 or a 1
  - Voltage greater than midpoint = 1
  - Voltage less than midpoint = 0
  - This removes noise as signals propagate, a big advantage of digital systems over analog systems
- If one switch can control another switch, we can build a computer!
- Our switches: CMOS transistors
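
The midpoint rule can be sketched in a few lines. This is a toy model under stated assumptions: `VDD` uses the ~1.0 V figure above, the threshold is taken as exactly half of it, and the names `to_bit` and `regenerate` are hypothetical.

```python
VDD = 1.0            # supply voltage in volts (the ~1.0 V figure above)
MIDPOINT = VDD / 2   # assumed decision threshold


def to_bit(voltage: float) -> int:
    """Interpret a voltage as a logical 0 or 1 using the midpoint."""
    return 1 if voltage > MIDPOINT else 0


def regenerate(voltage: float) -> float:
    """Model a digital stage that re-drives a clean 0 V or VDD output."""
    return VDD if to_bit(voltage) else 0.0


noisy = [0.93, 0.12, 0.81, 0.05]        # signals degraded by noise
print([to_bit(v) for v in noisy])       # logical values
print([regenerate(v) for v in noisy])   # clean voltages after one stage
```

Because each stage restores a full-strength value, small noise does not accumulate from stage to stage as it would in an analog system.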

#### **CMOS Transistor Networks**

- Modern digital systems designed in CMOS
  - MOS: Metal-Oxide on Semiconductor
  - C for complementary: use *pairs* of normally-*on* and normally-*off* switches
- CMOS transistors act as voltage-controlled switches
  - Similar to, though easier to work with than, electromechanical relay switches from an earlier era
  - Use energy primarily when switching

# **CMOS** Transistors

- Three terminals: source, gate, and drain
  - Switch action: if the voltage on the gate terminal is (some amount) higher/lower than on the source terminal, then a conducting path is established between the drain and source terminals (switch is closed)

n-channel transistor: off when voltage at Gate is low; on when voltage(Gate) > voltage(Threshold) (**High** resistance when gate voltage **Low**, **Low** resistance when gate voltage **High**)

p-channel transistor: on when voltage at Gate is low; off when voltage(Gate) > voltage(Threshold) (**Low** resistance when gate voltage **Low**, **High** resistance when gate voltage **High**)

Field-effect transistor (FET): CMOS circuits use a combination of p-type and n-type metal-oxide-semiconductor field-effect transistors (MOSFETs)



#### Intel 14nm Technology

1 nm = 1 / 1,000,000,000 m; wavelength visible light: 400 – 700 nm



Plan view of transistors

#### Sense of Scale




Source: Mark Bohr, IDF14

#### **CMOS Circuit Rules**

- Don't pass weak values => Use Complementary Pairs
  - N-type transistors pass weak 1's (V<sub>dd</sub> - V<sub>th</sub>)
  - N-type transistors pass strong 0's (ground)
  - Use N-type transistors only to pass 0's (N for negative)
  - Converse for P-type transistors:
    - Pass weak 0's (V<sub>th</sub>), strong 1's (V<sub>dd</sub>)
    - Use P-type transistors only to pass 1's (P for positive)
  - Use pairs of N-type and P-type to get strong values
- Never leave a wire undriven
  - Make sure there's always a path to  $V_{\rm dd}$  or GND
- Never create a path from V<sub>dd</sub> to GND (ground)
  - This would short-circuit the power supply!

#### **CMOS Networks**

*p-channel transistor*: on when voltage at Gate is low; off when voltage(Gate) > voltage(Threshold)

*n-channel transistor*: off when voltage at Gate is low; on when voltage(Gate) > voltage(Threshold)

What is the relationship between x and y?

| X            | Y            |
|--------------|--------------|
| 0 Volt (GND) | 1 Volt (Vdd) |
| 1 Volt (Vdd) | 0 Volt (GND) |

Called an *inverter* or *NOT gate*
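
The inverter can be modeled with transistors as ideal switches. This sketch assumes the usual CMOS inverter wiring, which is not spelled out in the text: a p-channel transistor between Vdd and y, an n-channel transistor between y and GND, and both gates connected to x. The function names are hypothetical.

```python
def n_channel_on(gate: int) -> bool:
    return gate == 1  # n-channel conducts when its gate is high


def p_channel_on(gate: int) -> bool:
    return gate == 0  # p-channel conducts when its gate is low


def inverter(x: int) -> int:
    pull_up = p_channel_on(x)    # path from Vdd to y
    pull_down = n_channel_on(x)  # path from y to GND
    # Exactly one path conducts: y is never undriven and never shorted.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for x in (0, 1):
    print(x, inverter(x))
```

The internal assertion mirrors two of the CMOS circuit rules: never leave a wire undriven, and never create a path from Vdd to GND.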

#### **Two-Input Networks**



What is the relationship between x, y and z?

| X      | Y      | Z      |
|--------|--------|--------|
| 0 Volt | 0 Volt | 1 Volt |
| 0 Volt | 1 Volt | 1 Volt |
| 1 Volt | 0 Volt | 1 Volt |
| 1 Volt | 1 Volt | 0 Volt |

Called a NAND gate (NOT AND)
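
A switch-level sketch of the NAND gate follows. It assumes the standard CMOS NAND structure, which is an assumption here because the circuit figure is not reproduced: two p-channel transistors in parallel between Vdd and z, and two n-channel transistors in series between z and GND.

```python
def nand(x: int, y: int) -> int:
    # Pull-up network: p-channel switches in parallel, so z connects
    # to Vdd when either gate is low.
    pull_up = (x == 0) or (y == 0)
    # Pull-down network: n-channel switches in series, so z connects
    # to GND only when both gates are high.
    pull_down = (x == 1) and (y == 1)
    assert pull_up != pull_down  # complementary networks
    return 1 if pull_up else 0


for x in (0, 1):
    for y in (0, 1):
        print(x, y, nand(x, y))
```

The two networks are complements of each other, so exactly one of them conducts for every input combination.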

#### Question



| X      | Y      | Z: option A | Z: option B | Z: option C | Z: option D |
|--------|--------|-------------|-------------|-------------|-------------|
| 0 Volt | 0 Volt | 0 Volts     | 0 Volts     | 1 Volt      | 1 Volt      |
| 0 Volt | 1 Volt | 0 Volts     | 1 Volt      | 0 Volts     | 1 Volt      |
| 1 Volt | 0 Volt | 0 Volts     | 1 Volt      | 0 Volts     | 1 Volt      |
| 1 Volt | 1 Volt | 1 Volt      | 1 Volt      | 0 Volts     | 0 Volts     |

#### **Combinational Logic Symbols**

 Common combinational logic systems have standard symbols called logic gates



#### Remember...

- AND



# Admin

- Midterm I: April 19<sup>th</sup>!
  - Allowed material: 1 English, double-sided A4 cheat sheet, <u>hand-written by you</u>.
    - Not copied: everything must be original and hand-written
    - Violations:
      - Found before midterm: cheat sheet is confiscated
      - During/after: 0 pts in midterm
  - MIPS green card provided by us!
  - No electronic devices, no <u>calculator</u>!
  - Content: Number representation, C, MIPS, CALL
  - Review session on April 17<sup>th</sup>.
- Project 1.1 autograder

#### **Boolean Algebra**

- Use plus "+" for OR
  - "logical sum": 1+0 = 0+1 = 1 (True); 1+1 = 1 (True, since any nonzero sum counts as True); 0+0 = 0 (False)
- Use product for AND (a•b or implied via ab)
  - "logical product": 0\*0 = 0\*1 = 1\*0 = 0 (False); 1\*1 = 1 (True)
- Overbar ("hat") to mean complement (NOT)
- Thus

$$ab + a + \overline{c} = a \cdot b + a + \overline{c} = (a \text{ AND } b) \text{ OR } a \text{ OR } (\text{NOT } c)$$
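
The notation maps directly onto bitwise operators on 0/1 values. The sketch below evaluates the expression above for every input combination; the function name `f` is just an illustrative choice.

```python
from itertools import product


def f(a: int, b: int, c: int) -> int:
    """ab + a + (NOT c): '+' is OR, juxtaposition is AND, overbar is NOT."""
    return (a & b) | a | (1 - c)


for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, f(a, b, c))
```

With inputs restricted to 0 and 1, `1 - c` is the complement of `c`.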







Exhaustive list of the output value generated for each combination of inputs

How many logic functions can be defined with N inputs?

| a | b | c | d | y          |
|---|---|---|---|------------|
| 0 | 0 | 0 | 0 | F(0,0,0,0) |
| 0 | 0 | 0 | 1 | F(0,0,0,1) |
| 0 | 0 | 1 | 0 | F(0,0,1,0) |
| 0 | 0 | 1 | 1 | F(0,0,1,1) |
| 0 | 1 | 0 | 0 | F(0,1,0,0) |
| 0 | 1 | 0 | 1 | F(0,1,0,1) |
| 0 | 1 | 1 | 0 | F(0,1,1,0) |
| 0 | 1 | 1 | 1 | F(0,1,1,1) |
| 1 | 0 | 0 | 0 | F(1,0,0,0) |
| 1 | 0 | 0 | 1 | F(1,0,0,1) |
| 1 | 0 | 1 | 0 | F(1,0,1,0) |
| 1 | 0 | 1 | 1 | F(1,0,1,1) |
| 1 | 1 | 0 | 0 | F(1,1,0,0) |
| 1 | 1 | 0 | 1 | F(1,1,0,1) |
| 1 | 1 | 1 | 0 | F(1,1,1,0) |
| 1 | 1 | 1 | 1 | F(1,1,1,1) |
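
A truth table with N inputs has $2^N$ rows, and each row's output can independently be 0 or 1, so there are $2^{2^N}$ distinct logic functions of N inputs. A small sketch (function names are hypothetical) enumerates the rows and counts the functions:

```python
from itertools import product


def truth_table_rows(n: int) -> list:
    """All input combinations for n inputs, in the usual counting order."""
    return list(product((0, 1), repeat=n))


def count_functions(n: int) -> int:
    """Each of the 2**n rows can output 0 or 1 independently."""
    return 2 ** (2 ** n)


print(len(truth_table_rows(4)))                  # rows in the 4-input table
print(count_functions(2), count_functions(4))    # functions of 2 and 4 inputs
```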

Truth Table Example #1: y = F(a,b): 1 iff $a \neq b$

| a | b | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |



Truth Table Example #3: 32-bit Unsigned Adder

| A         | B         | C          |
|-----------|-----------|------------|
| 000 ... 0 | 000 ... 0 | 000 ... 00 |
| 000 ... 0 | 000 ... 1 | 000 ... 01 |
| .         | .         | .          |
| .         | .         | .          |
| .         | .         | .          |
| 111 ... 1 | 111 ... 1 | 111 ... 10 |

How many rows?
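
The row count follows from the number of input bits: two 32-bit operands give 64 input bits, hence $2^{64}$ rows. The sketch below computes this and enumerates the full table for a tiny width; the helper names are hypothetical.

```python
def adder_rows(width: int) -> int:
    """Rows in the truth table of a width-bit unsigned adder (two inputs)."""
    return 2 ** (2 * width)


def adder_table(width: int) -> list:
    """Enumerate (A, B, C = A + B) rows; only practical for tiny widths."""
    limit = 2 ** width
    return [(a, b, a + b) for a in range(limit) for b in range(limit)]


print(adder_rows(32))          # rows for the 32-bit adder
print(len(adder_table(2)))     # a 2-bit adder is small enough to list
```

The sum C needs one more bit than the operands, which is why the last row of the table above shows a longer result.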

#### Truth Table Example #4: 3-input Majority Circuit

#### Y =

This is called *Sum of Products* form; Just another way to represent the TT as a logical expression

More simplified forms (fewer gates and wires)
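
A Sum of Products expression can be read off a truth table mechanically: write one AND term for every row whose output is 1, then OR the terms together. The sketch below does this for a 3-input majority function, assuming the usual definition (output is 1 when at least two inputs are 1). The names `majority` and `sum_of_products` are hypothetical, and `~` marks a complemented input.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    return 1 if a + b + c >= 2 else 0


def sum_of_products(f, names) -> str:
    """Build one product term per truth-table row where f is 1."""
    terms = []
    for values in product((0, 1), repeat=len(names)):
        if f(*values):
            literals = [n if v else "~" + n for n, v in zip(names, values)]
            terms.append("*".join(literals))
    return " + ".join(terms)


print(sum_of_products(majority, ["a", "b", "c"]))
```

One simpler equivalent form is `a*b + b*c + a*c`, which can be checked against the truth table in the same exhaustive way.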


## Boolean Algebra: Circuit & Algebraic Simplification



original circuit

equation derived from original circuit

algebraic simplification

simplified circuit

# Representations of Combinational Logic (groups of logic gates)



## Laws of Boolean Algebra

| Law                                           | Dual                                           | Name                 |
|-----------------------------------------------|------------------------------------------------|----------------------|
| $X \overline{X} = 0$                          | $X + \overline{X} = 1$                         | Complementarity      |
| $X \cdot 0 = 0$                               | $X + 1 = 1$                                    | Laws of 0's and 1's  |
| $X \cdot 1 = X$                               | $X + 0 = X$                                    | Identities           |
| $X X = X$                                     | $X + X = X$                                    | Idempotent Laws      |
| $X Y = Y X$                                   | $X + Y = Y + X$                                | Commutativity        |
| $(X Y) Z = X (Y Z)$                           | $(X + Y) + Z = X + (Y + Z)$                    | Associativity        |
| $X (Y + Z) = X Y + X Z$                       | $X + Y Z = (X + Y) (X + Z)$                    | Distribution         |
| $X Y + X = X$                                 | $(X + Y) X = X$                                | Uniting Theorem      |
| $\overline{X} Y + X = X + Y$                  | $(\overline{X} + Y) X = X Y$                   | Uniting Theorem v. 2 |
| $\overline{XY} = \overline{X} + \overline{Y}$ | $\overline{X + Y} = \overline{X} \overline{Y}$ | DeMorgan's Law       |
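
Because each variable takes only the values 0 and 1, any of these laws can be checked by trying every combination. The sketch below verifies a selection of them; the dictionary `LAWS` and the helper `holds` are hypothetical names for illustration.

```python
from itertools import product


def NOT(x: int) -> int:
    return 1 - x


# Each entry returns True when the law holds for one assignment of X, Y, Z.
LAWS = {
    "associativity (AND)": lambda x, y, z: ((x & y) & z) == (x & (y & z)),
    "associativity (OR)": lambda x, y, z: ((x | y) | z) == (x | (y | z)),
    "distribution": lambda x, y, z: (x | (y & z)) == ((x | y) & (x | z)),
    "uniting theorem": lambda x, y, z: ((x & y) | x) == x,
    "uniting theorem v. 2": lambda x, y, z: ((NOT(x) & y) | x) == (x | y),
    "DeMorgan (AND)": lambda x, y, z: NOT(x & y) == (NOT(x) | NOT(y)),
    "DeMorgan (OR)": lambda x, y, z: NOT(x | y) == (NOT(x) & NOT(y)),
}


def holds(law) -> bool:
    """Check a law for all 8 assignments of three Boolean variables."""
    return all(law(x, y, z) for x, y, z in product((0, 1), repeat=3))


for name, law in LAWS.items():
    print(name, holds(law))
```

Exhaustive checking works here only because the number of cases is tiny; for algebraic manipulation of larger expressions the laws themselves are the practical tool.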


## Boolean Algebraic Simplification Example

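$$
\begin{aligned}
y &= ab + a + c \\
  &= a(b+1) + c && \text{distribution, identity} \\
  &= a(1) + c && \text{law of 1's} \\
  &= a + c && \text{identity}
\end{aligned}
$$

| a | b | c | y |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |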
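
An algebraic simplification can be double-checked by comparing truth tables. The sketch below confirms that `ab + a + c` and `a + c` agree on all eight input combinations; the helper name `equivalent` is hypothetical.

```python
from itertools import product


def equivalent(f, g, n: int) -> bool:
    """True when f and g produce the same output for every n-bit input."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))


def original(a: int, b: int, c: int) -> int:
    return (a & b) | a | c   # y = ab + a + c


def simplified(a: int, b: int, c: int) -> int:
    return a | c             # y = a + c


print(equivalent(original, simplified, 3))
```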

## Question

• Simplify  $Z = A + BC + \overline{A}(\overline{BC})$ 

- A: Z = 0
- B:  $Z = \overline{A(1 + BC)}$
- C: Z = (A + BC)
- D: Z = BC
- E: Z = 1

## News (2017): Open Compute Project Summit: Google & ST Microelectronics: 48V to Chip

- Point-of-Load-(PoL) Converter
- 48V to 0.5V .. 1V .. up to 12V > 300 W @ 1V!
- Efficiency: 230V AC 89.3%; <u>48V DC 92.1%</u>





#### **Typical Conversion Efficiency**

System Efficiency







#### Signals and Waveforms: Grouping



#### Signals and Waveforms: Circuit Delay



### Sample Debugging Waveform

[Figure: simulator waveform window ("wave - default") showing the debug signals /tb/DBG_00[10] to /tb/DBG_00[0], the buses /tb/A, /tb/IB, /tb/ROMAD and /tb/D, and control signals such as /tb/OE_n, /tb/RAMCS_n, /tb/ROMCS_n, /tb/WE_n, /tb/ReadVRAM and /tb/CSyncX, over the interval 96986540 ps to 111169300 ps]

# Type of Circuits

- *Synchronous Digital Systems* consist of two basic types of circuits:
  - Combinational Logic (CL) circuits
    - Output is a function of the inputs only, not the history of its execution
    - E.g., circuits to add A, B (ALUs)
  - Sequential Logic (SL)
    - Circuits that "remember" or store information
    - aka "State Elements"
    - E.g., memories and registers
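
The difference between the two circuit types can be sketched in software. This is a loose analogy under stated assumptions, not a hardware description: `adder` stands in for a combinational block, and the hypothetical `Register` class models a state element that only updates on a clock edge.

```python
def adder(a: int, b: int) -> int:
    """Combinational logic: the output depends only on the current inputs."""
    return a + b


class Register:
    """Sequential logic: remembers a value between clock edges."""

    def __init__(self, initial: int = 0):
        self.q = initial  # stored value (the state)
        self.d = initial  # input waiting for the next clock edge

    def clock_edge(self) -> None:
        self.q = self.d   # state changes only when the clock ticks


# An accumulator: the register feeds the adder, and the adder feeds the register.
acc = Register()
for value in (3, 4, 5):
    acc.d = adder(acc.q, value)
    acc.clock_edge()
print(acc.q)  # 12
```

Calling `adder` twice with the same arguments always gives the same result, while the value read from `acc.q` depends on the history of clock edges.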