Name (Pinyin): Email (Prefix):

# Computer Architecture I

# Homework 6

2021 Spring April 22

Instructions:

Homework 6 covers the content of caches, please refer to the lecture slides. You can print it out, write on it and <u>scan</u> it into a pdf, or you can edit the PDF directly, just remember: you must create a <u>PDF</u> and upload to the <u>Gradescope</u>. Please assign the questions properly on Gradescope, otherwise you will lose 25% of points.

## [30 points] Question Set 1. Direct Mapped Cache

In a 32-bit machine (word size is 32 bit), the clock frequency is 2GHz. We have a direct mapped cache with properties as follows:

- 1. Cache size is 32 Bytes;
- 2. Block size is 8 Bytes;
- 3. Cache hit time is 2 cycles;
- 4. Cache miss penalty is 100 cycles;

1-A. What is the width (in bits) of each field of following address bit assignment?

| TAG: 27 | Set index: 2 | Block offset: 3 |
|---------|--------------|-----------------|
|         |              |                 |

Please provide info about how you came to this result. (Answer 6pt + Analysis 4pt)

1–B. We will access the data of addresses as follows. Fill in the blanks. It is about T/I/O(tag/index/offset), whether there is a hit. (each line worth 1 pt.)

| Addresses (serially access) | Т/I/О | Miss or Hit {"Miss", "Hit"} |
|-----------------------------|-------|-----------------------------|
|                             |       |                             |
| 0x0000004                   | 0/0/4 | Miss                        |
| 0x0000005                   | 0/0/5 | Hit                         |
| 0x0000068                   | 3/1/0 | Miss                        |
| 0x00000c8                   | 6/1/0 | Miss                        |
| 0x0000068                   | 3/1/0 | Miss                        |
| 0x00000dd                   | 6/3/5 | Miss                        |
| 0x0000045                   | 2/0/5 | Miss                        |
| 0x0000004                   | 0/0/4 | Miss                        |
| 0x00000c8                   | 6/1/0 | Miss                        |
| 0x0000004                   | 0/0/4 | Hit                         |

1–C. Calculations. (Show progress, worth 50% pts)

1-C-i: miss rate: (4 pt.)

#### 0.8

1-C-ii: AMAT (ns): (3 pt.)

1 + 0.8 \* 50 = 41 ns

1-C-iii: AMAT if we don't have this cache (ns): (3 pt.)

#### 50ns

## [30 points] Question Set 2. Four-Way Set Associative Cache

From Q1 (32bit machine, clock frequency is 2GHz), we implemented a four–way set associative cache. The parameters are shown as follows:

- 1. Cache size is 32 B;
- 2. Block size is 8 Bytes;
- 3. Cache hit time is 2 cycles;
- 4. Cache miss penalty is 100 cycles;

2–A. What the width of each field of following address bit assignment:

| TAG: 29 | Set index: 0 | Block offset: 3 |
|---------|--------------|-----------------|
|         |              |                 |

Give me the progress how you think about it. (Answer 6pt + Analysis 4pt)

2–B. We will access the data of addresses as follows. Fill in the blanks. It is about

T/I/O(tag/index/offset), whether there is a hit. Use LRU algorithm. (each line worth 1 pt.)

| Addresses (serially | Т/I/О  | Miss or Hit {"Miss", "Hit"} |
|---------------------|--------|-----------------------------|
| access)             |        |                             |
| 0x0000004           | 0/0/4  | Miss                        |
| 0x0000005           | 0/0/5  | Hit                         |
| 0x0000068           | 13/0/0 | Miss                        |
| 0x00000c8           | 25/0/0 | Miss                        |
| 0x0000068           | 13/0/0 | Hit                         |
| 0x00000dd           | 27/0/5 | Miss                        |
| 0x0000045           | 8/0/5  | Miss                        |
| 0x0000004           | 0/0/4  | Miss                        |
| 0x00000c8           | 25/0/0 | Miss                        |
| 0x0000004           | 0/0/4  | Hit                         |

2–C. Calculations.2–C–i. Miss rate: (5 pt.)

### 0.7

2-C-ii. Calculate the AMAT in ns. (5 pt.)

1 + 0.7 \* 50 = 36

### [22 points] Question Set 3. Cache Friendly Programming

This C program is run on a processor with a direct–mapped data cache with a size of 1KiB and a block size of 16 bytes. Assume that your cache begins cold and no optimization. (32bit machine)

```
int i, j, A[256*256];
sum = 0;
for (i = 0; i < 256; i++) {
  for (j = 0; j < 256; j++) {
    sum += A[256*i + j];
  }
}
```

Assume no optimization and sizeof(int) == 4 and A = 0x4000. 3–A–i. Make use of what kind of locality (2 pt.) Spacial locality.

3-A-ii. Calculate the miss rate. Show progress (10 pt.)

Every four miss one. Miss rate = 0.25

3-B. Calculate the miss rate.

```
int i, j, A[256*256];
sum = 0;
for (i = 0; i < 256; i++) {
  for (j = 0; j < 256; j++) {
     sum += A[256*j + i];
  }
}
```

Show your progress (10 pt.)

All miss. Miss rate = 1

## [18 points] Question Set 4. AMAT

4–A. Impact of increasing associativity with fixed block size and number of sets. Fill in with increase, decrease or unchange. (3 pt)

| Hit time     | increase |
|--------------|----------|
| Miss rate    | decrease |
| Miss penalty | unchange |

4–B. Suppose your L1\$ has a hit time of 2 cycles and a local miss rate of 20%. Your L2\$ has a hit time of 15 cycles and a global miss rate of 5%. Your main memory access needs 100 cycles.

4-B-i. Write down your local miss rate of L2\$. (5 pt)

5 / 20 = 25%

4-B-ii. Write down the AMAT of the system. (5 pt)

2 + 0.2 \* (15 + 0.25 \* 100) = 10 cycles

4–B–iii. Suppose your newly added L3\$ has a hit time of 30 cycles and you want reduce your AMAT of the system to 8 cycle, what is the largest local miss rate? (5 pt)

2 + 0.2 \* (15 + 0.25 \* (30 + m \* 100)) <= 8 M <= 30%