## Homework 6

Computer Architecture I ShanghaiTech University
HW5 HW6 HW7

## Goals

This assignment will cover caches and floating point numbers.

## Exercises

### Problem 1: Cache Associativity (4 pts)

One Associative Cache is formed by 64 blocks divided into 8 sets and each block has 256 bytes. The main memory has 4096 blocks in total.
1. How many bits are required for the main memory.
2. Please write down the format of the main memory.

### Problem 2: Cache Basic (8 pts)

Assume we have 16 bit byte-addresses. A very tiny direct-mapped cache holds a total of 32 bytes and each cache block is two words (== 2 x 16 bits) wide.
1. How many bits are used for the tag?
2. Which other bytes would share a block with the byte at address 0x75? Please write the bytes in hex (ex: 0xAA) in your answer, and separate multiple answers by single spaces, if you wish to include more then one byte in your answer (ex: 0xAB 0xFF)
3. The block containing the byte at 0x75 is in the cache. What memory accesses are guaranteed to get a cache miss? Specify what the value of the index bits equate to, in decimal.
4. The hit time is one clock cycle, and the miss penalty is 100. If the miss rate is 50%, what is the AMAT? Please specify your answer in decimal (ex: 12.5)

### Problem 3: Cache Friendly Programming (6 pts)

Assume we have 16 bit memory addresses. Our cache holds a total of 64 bytes, and each cache block is 8 bytes. Examine the following code. Assume all array accesses are valid (array is large enough), the cache starts empty, and that our data cache is direct-mapped. v need not be block aligned. What is the maximum and minimum data cache hit rate for a single call to shiftarray (remember to account for loads AND stores) if...

• j = 1?
• j = 8?
• j = 64?
```int j = VALUE_SPECIFIED_ABOVE;
void shiftarray(char* v) {
for (int i = 0; i < 10; i++)
v[i] = v[i+j];
}
```

### Problem 4: Cache Friendly Programming (6 pts)

The following C program is run (with no optimization) on a processor with a direct-mapped data cache with a size of 1KiB and a block size of 32 bytes.

```int i, j, array[256*256];

/* ... */

for (i = 0 ; i < 255 ; i++) {
for (j = 0 ; j < 256 ; j++) {
array[256*j] += array[256*j + i + 1];
}
}
```

Assume sizeof(int) == 4 and array == 0x4000.

1. Calculate the miss rate.
2. Rewrite the code to improve the performance (don't use cache blocking). What is the miss rate in this case? Give a fraction or three significant digits for the percentage.
3. Will a write-back cache be helpful with the code in (b)? Explain briefly, in one or two sentences.

### Problem 5: Floating Point Numbers (10 pts)

For the following questions, we will be referring to the IEEE 32-bit floating point representation except with a 7 bit exponent (bias of 2^7/2 - 1 = 63) and a denorm implicit exponent of -62.