# CS 110 Computer Architecture Synchronous Digital Systems 

Instructor:
Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology SIST
ShanghaiTech University
Slides based on UC Berkeley's CS61C

## Levels of Representation/Interpretation



## You are Here!

Harness Parallelism \& Achieve High Performance

- Parallel Requests: assigned to computer, e.g., Search "Katz"
- Parallel Threads: assigned to core, e.g., Lookup, Ads
- Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
- Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
- Hardware descriptions: all gates @ one time
- Programming Languages

Figure labels: Hardware; Warehouse Scale Computer


## Hardware Design

- Next several weeks: how a modern processor is built, starting with basic elements as building blocks
- Why study hardware design?
- Understand capabilities and limitations of HW in general and processors in particular
- What processors can do fast and what they can't do fast (avoid slow things if you want your code to run fast!)
- Background for more in-depth HW courses
- Hard to know what you'll need for the next 30 years
- There is only so much you can do with standard processors: you may need to design your own custom HW for extra performance
- Even some commercial processors today have customizable hardware!


## Synchronous Digital Systems

Hardware of a processor, such as the MIPS, is an example of a Synchronous Digital System.

Synchronous:

- All operations coordinated by a central clock
- "Heartbeat" of the system!

Digital:

- Represent all values by discrete values
- Two binary digits: 1 and 0
- Electrical signals are treated as 1's and 0's
- 1 and 0 are complements of each other
- High / low voltage for true / false, 1 / 0


## Switches: Basic Element of Physical Implementations

- Implementing a simple circuit (arrow shows action if wire changes to "1" or is asserted):


$$
Z \equiv A
$$

## Switches (cont'd)

- Compose switches into more complex ones (Boolean functions):

AND (switches $A$ and $B$ in series): $Z \equiv A$ and $B$
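Composing switches into Boolean functions can be sketched in Python. This is a minimal model, not part of the original slides: the names `series` and `parallel` are made up for illustration, and the parallel (OR) case is an assumption taken from standard switch logic.

```python
def series(a: bool, b: bool) -> bool:
    """Two switches in series conduct only if both are closed (AND)."""
    return a and b


def parallel(a: bool, b: bool) -> bool:
    """Two switches in parallel conduct if either one is closed (OR)."""
    return a or b


# Print the full table: A, B, series (AND), parallel (OR)
for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), int(series(a, b)), int(parallel(a, b)))
```

Reading each row as "is there a conducting path?" gives the AND and OR columns directly.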


## Historical Note

- Early computer designers built ad hoc circuits from switches
- Began to notice common patterns in their work: ANDs, ORs, ...
- Master's thesis (by Claude Shannon, 1940) made the link between their work and the 19th-century mathematician George Boole
- Called it "Boolean" in his honor
- Could apply math to give theory to hardware design, minimization, ...


## Transistors

- High voltage ($\mathrm{V}_{\mathrm{dd}}$) represents 1, or true
- In modern microprocessors, $\mathrm{V}_{\mathrm{dd}} \sim 1.0$ Volt
- Low voltage (0 Volt or Ground) represents 0, or false
- Pick a midpoint voltage to decide if a 0 or a 1
- Voltage greater than midpoint = 1
- Voltage less than midpoint = 0
- This removes noise as signals propagate - a big advantage of digital systems over analog systems
- If one switch can control another switch, we can build a computer!
- Our switches: CMOS transistors
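The midpoint rule can be sketched as follows. The supply value of 1.0 V follows the slide. The midpoint at half of it, the sample voltages, and the names `to_bit` and `restore` are assumptions for illustration.

```python
VDD = 1.0            # supply voltage in volts (approximate value from the slide)
MIDPOINT = VDD / 2   # assumed decision threshold


def to_bit(voltage: float) -> int:
    """Interpret an analog voltage as a digital 0 or 1."""
    return 1 if voltage > MIDPOINT else 0


def restore(voltage: float) -> float:
    """Regenerate a clean level, discarding the noise on the input."""
    return VDD if to_bit(voltage) else 0.0


noisy = [0.93, 0.08, 1.04, -0.05, 0.71, 0.22]  # made-up noisy samples
print([to_bit(v) for v in noisy])   # [1, 0, 1, 0, 1, 0]
print([restore(v) for v in noisy])  # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

Because every stage re-decides 0 or 1, noise does not accumulate from stage to stage as it would in an analog signal chain.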


## CMOS Transistor Networks

- Modern digital systems designed in CMOS
- MOS: Metal-Oxide on Semiconductor
- C for complementary: use pairs of normally-on and normally-off switches
- CMOS transistors act as voltage-controlled switches
- Similar to, though easier to work with than, electromechanical relay switches from an earlier era
- Use energy primarily when switching


## CMOS Transistors

- Three terminals: source, gate, and drain
- Switch action: if the voltage on the gate terminal is (some amount) higher/lower than the source terminal, then a conducting path is established between the drain and source terminals (switch is closed)

n-channel transistor:

- off when voltage at Gate is low
- on when voltage(Gate) > voltage(Threshold)

p-channel transistor:

- on when voltage at Gate is low
- off when voltage(Gate) > voltage(Threshold)

Field-effect transistor (FET): CMOS circuits use a combination of p-type and n-type metal-oxide-semiconductor field-effect transistors.
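The on/off rules for the two transistor types can be written as a small switch-level model. This is a sketch only: the threshold value and the function names are hypothetical, and real devices compare the gate against the source rather than against a fixed level.

```python
VDD = 1.0    # supply voltage in volts
V_TH = 0.3   # hypothetical threshold voltage, chosen for illustration


def nmos_is_on(v_gate: float) -> bool:
    """n-channel: on when the gate voltage is above the threshold."""
    return v_gate > V_TH


def pmos_is_on(v_gate: float) -> bool:
    """p-channel: on when the gate voltage is low, off above the threshold."""
    return not (v_gate > V_TH)


for v_gate in (0.0, VDD):
    print(f"gate={v_gate:.1f} V  nmos on: {nmos_is_on(v_gate)}  pmos on: {pmos_is_on(v_gate)}")
```

In this simplified model the two types always respond in opposite ways to the same gate voltage, which is what "complementary" refers to.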



## Intel 14nm Technology



Side view of wiring layers

## Sense of Scale



Source: Mark Bohr, IDF14

## CMOS Circuit Rules

- Don't pass weak values => Use Complementary Pairs
- N-type transistors pass weak 1's ($\mathrm{V}_{\mathrm{dd}}-\mathrm{V}_{\mathrm{th}}$)
- N-type transistors pass strong 0's (ground)
- Use N-type transistors only to pass 0's (N for negative)
- Converse for P-type transistors: Pass weak 0s, strong 1s
- Pass weak 0's ($\mathrm{V}_{\mathrm{th}}$), strong 1's ($\mathrm{V}_{\mathrm{dd}}$)
- Use P-type transistors only to pass 1's (P for positive)
- Use pairs of N-type and P-type to get strong values
- Never leave a wire undriven
- Make sure there's always a path to $\mathrm{V}_{\mathrm{dd}}$ or GND
- Never create a path from $\mathrm{V}_{\mathrm{dd}}$ to GND (ground)
- This would short-circuit the power supply!


## CMOS Networks

p-channel transistor:

- on when voltage at Gate is low
- off when voltage(Gate) > voltage(Threshold)

What is the relationship between $x$ and $y$?

Called an inverter or NOT gate
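A switch-level sketch of the inverter. The figure is not reproduced here, so the topology is an assumption: the usual CMOS arrangement with one p-channel transistor between $\mathrm{V}_{\mathrm{dd}}$ and $y$, one n-channel transistor between $y$ and ground, and $x$ driving both gates.

```python
def inverter(x: int) -> int:
    """Output y of a CMOS inverter for a logic-level input x (0 or 1)."""
    pull_up = (x == 0)     # p-channel is on when its gate is low: y connects to Vdd
    pull_down = (x == 1)   # n-channel is on when its gate is high: y connects to GND
    # Exactly one network conducts, so y is never undriven and never shorted.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for x in (0, 1):
    print(x, inverter(x))
```

The internal assertion mirrors the CMOS circuit rules: always a path to $\mathrm{V}_{\mathrm{dd}}$ or GND, never both at once.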

## Two-Input Networks



Called a NAND gate (NOT AND)
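A switch-level sketch of a two-input NAND. The figure is not reproduced here, so the topology is an assumption: the standard CMOS form with two p-channel transistors in parallel to $\mathrm{V}_{\mathrm{dd}}$ and two n-channel transistors in series to ground.

```python
def nand(a: int, b: int) -> int:
    """Output of a two-input CMOS NAND gate for logic-level inputs."""
    pull_up = (a == 0) or (b == 0)      # parallel p-channel pair: either low input pulls y high
    pull_down = (a == 1) and (b == 1)   # series n-channel pair: both inputs high pull y low
    # The two networks are complementary: exactly one conducts for any input.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand(a, b))
```

The output is 0 only when both inputs are 1, which is AND followed by NOT.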

## Clickers/Peer Instruction




## Combinational Logic Symbols

- Common combinational logic systems have standard symbols called logic gates



## Remember...

AND



## Admin

- Project 1.1 will be published soon
- Send your Lab TA your additional email - you will not be able to submit your project to gradebot without it!
- Midterm I: April 6th!
- Allowed material: 1 hand-written English double-sided A4 cheat sheet.
- MIPS green card provided by us!
- Content: Number representation, C, MIPS
- Review session on March 30th.


## Boolean Algebra

- Use plus "+" for OR
- "logical sum": $1+0=0+1=1$ (True); $1+1=2$ (True, since any nonzero value counts as 1); $0+0=0$ (False)
- Use product for AND ($a \bullet b$ or implied via $ab$)
- "logical product": $0 \bullet 0=0 \bullet 1=1 \bullet 0=0$ (False); $1 \bullet 1=1$ (True)
- Overbar ("hat") to mean complement (NOT)
- Thus

$$
\begin{aligned}
a b+a+\bar{c} & =a \bullet b+a+\bar{c} \\
& =(a \text{ AND } b) \text{ OR } a \text{ OR } (\text{NOT } c)
\end{aligned}
$$
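The notation maps directly onto code. A minimal sketch, assuming values are the integers 0 and 1. The helper names `OR`, `AND`, `NOT` and `f` are chosen for illustration.

```python
def OR(x: int, y: int) -> int:
    """Logical sum: 1 if either input is 1."""
    return x | y


def AND(x: int, y: int) -> int:
    """Logical product: 1 only if both inputs are 1."""
    return x & y


def NOT(x: int) -> int:
    """Complement of a single bit."""
    return 1 - x


def f(a: int, b: int, c: int) -> int:
    """ab + a + c-bar, i.e. (a AND b) OR a OR (NOT c)."""
    return OR(OR(AND(a, b), a), NOT(c))


print(f(0, 0, 1))  # 0: no term is true
print(f(0, 0, 0))  # 1: NOT c is true
print(f(1, 0, 1))  # 1: a is true
```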


## Truth Tables for Combinational Logic

Exhaustive list of the output value generated for each combination of inputs.

How many logic functions can be defined with N inputs?

| a | b | c | d | y |
| :--- | :--- | :--- | :--- | :--- |
| 0 | 0 | 0 | 0 | $\mathrm{~F}(0,0,0,0)$ |
| 0 | 0 | 0 | 1 | $\mathrm{~F}(0,0,0,1)$ |
| 0 | 0 | 1 | 0 | $\mathrm{~F}(0,0,1,0)$ |
| 0 | 0 | 1 | 1 | $\mathrm{~F}(0,0,1,1)$ |
| 0 | 1 | 0 | 0 | $\mathrm{~F}(0,1,0,0)$ |
| 0 | 1 | 0 | 1 | $\mathrm{~F}(0,1,0,1)$ |
| 0 | 1 | 1 | 0 | $\mathrm{~F}(0,1,1,0)$ |
| 0 | 1 | 1 | 1 | $\mathrm{~F}(0,1,1,1)$ |
| 1 | 0 | 0 | 0 | $\mathrm{~F}(1,0,0,0)$ |
| 1 | 0 | 0 | 1 | $\mathrm{~F}(1,0,0,1)$ |
| 1 | 0 | 1 | 0 | $\mathrm{~F}(1,0,1,0)$ |
| 1 | 0 | 1 | 1 | $\mathrm{~F}(1,0,1,1)$ |
| 1 | 1 | 0 | 0 | $\mathrm{~F}(1,1,0,0)$ |
| 1 | 1 | 0 | 1 | $\mathrm{~F}(1,1,0,1)$ |
| 1 | 1 | 1 | 0 | $\mathrm{~F}(1,1,1,0)$ |
| 1 | 1 | 1 | 1 | $\mathrm{~F}(1,1,1,1)$ |
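The table above can be generated mechanically, which also suggests how to count functions. An N-input table has $2^N$ rows, and each row's output can be chosen independently as 0 or 1, giving $2^{2^N}$ possible functions. In the sketch below, `truth_table` and `count_functions` are illustrative names, and the 4-input AND is just a sample function.

```python
from itertools import product


def truth_table(f, n: int):
    """List (inputs, output) for every combination of n binary inputs."""
    return [(bits, f(*bits)) for bits in product((0, 1), repeat=n)]


def count_functions(n: int) -> int:
    """Number of distinct logic functions of n inputs: 2 ** (2 ** n)."""
    return 2 ** (2 ** n)


table = truth_table(lambda a, b, c, d: a & b & c & d, 4)  # sample function
print(len(table))          # 16 rows for 4 inputs
print(count_functions(4))  # 65536
```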

## Truth Table Example \#1: $y=F(a, b): 1$ iff $a \neq b$

| $a$ | $b$ | $y$ |
| :---: | :---: | :---: |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

## Truth Table Example \#2: 2-bit Adder

| A | B | C |
| :---: | :---: | :--- |
| $a_{1} a_{0}$ | $b_{1} b_{0}$ | $c_{2} c_{1} c_{0}$ |

How many rows?
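One way to answer the question is to enumerate the table. A sketch, assuming A and B are unsigned 2-bit values and C is their 3-bit sum. The function name `two_bit_adder_rows` is illustrative.

```python
from itertools import product


def two_bit_adder_rows():
    """Return every (A, B, C) row of the 2-bit adder as bit strings."""
    rows = []
    for a1, a0, b1, b0 in product((0, 1), repeat=4):
        total = (2 * a1 + a0) + (2 * b1 + b0)   # unsigned sum, at most 3 + 3 = 6
        rows.append((f"{a1}{a0}", f"{b1}{b0}", format(total, "03b")))
    return rows


rows = two_bit_adder_rows()
print(len(rows))   # 16: four input bits give 2**4 rows
print(rows[-1])    # ('11', '11', '110')
```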

## Truth Table Example \#3: 32-bit Unsigned Adder

| A | B | C |
| :---: | :---: | :--- |
| $000 \ldots 0$ | $000 \ldots 0$ | $000 \ldots 00$ |
| $000 \ldots 0$ | $000 \ldots 1$ | $000 \ldots 01$ |
| $\vdots$ | $\vdots$ | $\vdots$ |
| $111 \ldots 1$ | $111 \ldots 1$ | $111 \ldots 10$ |

How many rows?
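As a worked count, assuming one row per combination of the two 32-bit inputs:

$$
\text{rows}=2^{32} \cdot 2^{32}=2^{64} \approx 1.8 \times 10^{19}
$$

A table of that size cannot be written out, which is why larger circuits are described by expressions and by composing smaller blocks.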

## Truth Table Example \#4: 3-input Majority Circuit

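$$
y=\bar{a} b c+a \bar{b} c+a b \bar{c}+a b c
$$

This is called Sum of Products form: one product term for each truth table row where $y=1$. It is just another way to represent the TT as a logical expression.

More simplified forms exist (fewer gates and wires).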

| a | b | c | y |
| :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |

## Boolean Algebra: Circuit \& Algebraic Simplification



Representations of Combinational Logic (groups of logic gates)


## Laws of Boolean Algebra

$$
\begin{array}{lll}
X \bar{X}=0 & X+\bar{X}=1 & \text{Complementarity} \\
X 0=0 & X+1=1 & \text{Laws of 0's and 1's} \\
X 1=X & X+0=X & \text{Identities} \\
X X=X & X+X=X & \text{Idempotent Laws} \\
X Y=Y X & X+Y=Y+X & \text{Commutativity} \\
(X Y) Z=X(Y Z) & (X+Y)+Z=X+(Y+Z) & \text{Associativity} \\
X(Y+Z)=X Y+X Z & X+Y Z=(X+Y)(X+Z) & \text{Distribution} \\
X Y+X=X & (X+Y) X=X & \text{Uniting Theorem} \\
\bar{X} Y+X=X+Y & (\bar{X}+Y) X=X Y & \text{Uniting Theorem v. 2} \\
\overline{X Y}=\bar{X}+\bar{Y} & \overline{X+Y}=\bar{X} \bar{Y} & \text{DeMorgan's Law}
\end{array}
$$
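## Boolean Algebraic Simplification Example

$$
\begin{aligned}
y & =a b+a+c & & \\
& =a(b+1)+c & & \text{distribution, identity} \\
& =a(1)+c & & \text{law of 1's} \\
& =a+c & & \text{identity}
\end{aligned}
$$

| a | b | c | y |
| :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |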
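The simplification can be double-checked by comparing $ab+a+c$ with $a+c$ on every input. A minimal sketch with illustrative function names:

```python
from itertools import product


def original(a: int, b: int, c: int) -> int:
    """y = ab + a + c"""
    return (a & b) | a | c


def simplified(a: int, c: int) -> int:
    """y = a + c (b no longer appears)"""
    return a | c


rows = [(a, b, c, original(a, b, c)) for a, b, c in product((0, 1), repeat=3)]
for a, b, c, y in rows:
    print(a, b, c, y, simplified(a, c))
```

The two output columns match on all eight rows, so $b$ has no effect on $y$.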

## Question

- Simplify $Z=A+B C+\bar{A}(\overline{B C})$
- A: $Z=0$
- B: $Z=\overline{A(1+B C)}$
- C: $Z=(A+B C)$
- D: $Z=B C$
- E: $Z=1$
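One way to check an answer after attempting the algebra is to compare each option with $Z$ on all eight inputs. A sketch, where `CANDIDATES` and `matching_options` are illustrative names:

```python
from itertools import product


def NOT(x: int) -> int:
    return 1 - x


def z(a: int, b: int, c: int) -> int:
    """Z = A + BC + (NOT A)(NOT (BC))"""
    return a | (b & c) | (NOT(a) & NOT(b & c))


CANDIDATES = {
    "A": lambda a, b, c: 0,
    "B": lambda a, b, c: NOT(a & (1 | (b & c))),
    "C": lambda a, b, c: a | (b & c),
    "D": lambda a, b, c: b & c,
    "E": lambda a, b, c: 1,
}


def matching_options():
    """Names of the options that equal Z for every input combination."""
    inputs = list(product((0, 1), repeat=3))
    return [name for name, f in CANDIDATES.items()
            if all(f(*bits) == z(*bits) for bits in inputs)]


print(matching_options())
```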


## News: Open Compute Project Summit

Google \& ST Microelectronics: 48V to Chip

- Point-of-Load (PoL) Converter
- 48 V to 0.5 V .. 1 V .. up to 12 V; >300 W @ 1 V!
- Efficiency: 230V AC 89.3\%; 48V DC 92.1\%

Latest 100 A+ VTM (and 200 A turbo mode) consumes only $13 \times 23$ mm area. Single VTM replaces multiple conventional DRMOS and inductor stages.

Figures: Typical Conversion Efficiency; System Efficiency



## Signals and Waveforms

## Signals and Waveforms: Grouping

## Signals and Waveforms: Circuit Delay


$$
\begin{aligned}
& A=\left[a_{3}, a_{2}, a_{1}, a_{0}\right] \\
& B=\left[b_{3}, b_{2}, b_{1}, b_{0}\right]
\end{aligned}
$$



## Sample Debugging Waveform

(Screenshot of a waveform viewer showing testbench signals, including /tb/DBG_00[10] through /tb/DBG_00[0], /tb/A, /tb/ROMAD, /tb/D, /tb/TState, /tb/RAMCS_n, /tb/ReadVRAM and /tb/CSyncX, over the interval 96986540 ps to 111169300 ps.)

## Type of Circuits

- Synchronous Digital Systems consist of two basic types of circuits:
- Combinational Logic (CL) circuits
- Output is a function of the inputs only, not the history of its execution
- E.g., circuits to add A, B (ALUs)
- Sequential Logic (SL)
- Circuits that "remember" or store information
- aka "State Elements"
- E.g., memories and registers


## Uses for State Elements

- Place to store values for later re-use:
- Register files (like \$1-\$31 in MIPS)
- Memory (caches and main memory)
- Help control flow of information between combinational logic blocks
- State elements hold up the movement of information at input to combinational logic blocks to allow for orderly passage


## Accumulator Example

Why do we need to control the flow of information?


Want:

$$
\begin{aligned}
& S=0 ; \\
& \text { for } \quad(i=0 ; i<n ; i++) \\
& \quad S=S+X_{i}
\end{aligned}
$$

Assume:

- Each $X$ value is applied in succession, one per cycle
- After n cycles the sum is present on $S$


## First Try: Does this work?



No!

- Reason \#1: How to control the next iteration of the 'for' loop?
- Reason \#2: How do we say: 'S=0'?

## Second Try: How About This?



Register is used to hold up the transfer of data to adder

Square wave clock sets when things change


Rough timing: the clock alternates between High (1) and Low (0), and $S$ steps through $x_{0}$, $x_{0}+x_{1}$, $x_{0}+x_{1}+x_{2}$, ...

$X_{i}$ must be ready before the clock edge due to adder delay
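The register-plus-adder accumulator can be simulated cycle by cycle. This is a behavioral sketch under stated assumptions, not the circuit from the figure: the `Register` class, its `reset_value`, and the `accumulate` function are hypothetical, and adder delay is modeled only by computing the sum before the clock edge.

```python
class Register:
    """State element: output q changes only on a clock edge."""

    def __init__(self, reset_value: int = 0):
        self.q = reset_value   # output, visible to the adder
        self.d = reset_value   # input, waiting for the next clock edge

    def clock_edge(self) -> None:
        """On the clock edge, capture the input and hold it until the next edge."""
        self.q = self.d


def accumulate(xs):
    """Feed one X_i per cycle and return the value of S after each clock edge."""
    s = Register(reset_value=0)   # a reset is one way to say "S = 0"
    trace = []
    for x in xs:
        s.d = s.q + x     # combinational adder output, must settle before the edge
        s.clock_edge()    # the register lets the new sum through
        trace.append(s.q)
    return trace


print(accumulate([3, 1, 4]))  # [3, 4, 8]
```

The register holds $S$ steady while the adder works, so each $X_i$ is added exactly once per cycle instead of feeding back uncontrolled.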


