# Homework 3 - Computer Architecture I - ShanghaiTech University

## Project 1.2 RISC-V Disassembler

### Introduction

In Project 1.2, you will implement a disassembler that supports a subset of the RISC-V instructions. The disassembler translates 32-bit machine code into RISC-V assembly code. This project is easy as long as you took the course, Lab 3, and Lab 4 seriously.

Implementing a real disassembler that supports all RISC-V instructions is exhausting, so in this project you only need to disassemble the subset of RISC-V instructions that appeared in Project 1.1. The instruction set is listed below.

#### Instruction Set Your Disassembler Should Support

| INSTRUCTION | TYPE | OPCODE | FUNCT3 | FUNCT7 / IMM | OPERATION |
| --- | --- | --- | --- | --- | --- |
| add | R | 0x33 | 0x0 | 0x00 | R[rd] ← R[rs1] + R[rs2] |
| or | R | 0x33 | 0x6 | 0x00 | R[rd] ← R[rs1] \| R[rs2] |
| slt | R | 0x33 | 0x2 | 0x00 | R[rd] ← (R[rs1] < R[rs2]) ? 1 : 0 |
| sltu | R | 0x33 | 0x3 | 0x00 | R[rd] ← (U(R[rs1]) < U(R[rs2])) ? 1 : 0 |
| sll | R | 0x33 | 0x1 | 0x00 | R[rd] ← R[rs1] << R[rs2] |
| jalr | I | 0x67 | 0x0 | | R[rd] ← PC + 4; PC ← R[rs1] + imm |
| addi | I | 0x13 | 0x0 | | R[rd] ← R[rs1] + imm |
| ori | I | 0x13 | 0x6 | | R[rd] ← R[rs1] \| imm |
| lb | I | 0x03 | 0x0 | | R[rd] ← SignExt(Mem(R[rs1] + offset, byte)) |
| lbu | I | 0x03 | 0x4 | | R[rd] ← U(Mem(R[rs1] + offset, byte)) |
| lw | I | 0x03 | 0x2 | | R[rd] ← Mem(R[rs1] + offset, word) |
| sb | S | 0x23 | 0x0 | | Mem(R[rs1] + offset) ← R[rs2][7:0] |
| sw | S | 0x23 | 0x2 | | Mem(R[rs1] + offset) ← R[rs2] |
| beq | SB | 0x63 | 0x0 | | if (R[rs1] == R[rs2]) PC ← PC + {offset, 1b'0} |
| bne | SB | 0x63 | 0x1 | | if (R[rs1] != R[rs2]) PC ← PC + {offset, 1b'0} |
| blt | SB | 0x63 | 0x4 | | if (R[rs1] < R[rs2]) PC ← PC + {offset, 1b'0} |
| bge | SB | 0x63 | 0x5 | | if (R[rs1] >= R[rs2]) PC ← PC + {offset, 1b'0} |
| jal | UJ | 0x6f | | | R[rd] ← PC + 4; PC ← PC + {imm, 1b'0} |
| lui | U | 0x37 | | | R[rd] ← {offset, 12b'0} |

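Because every R-type instruction in this subset has funct7 0x00, the opcode and funct3 fields are enough to identify the mnemonic. As a host-side illustration only (your submission must still be RISC-V assembly), the table can be turned into a lookup like the following Python sketch. The names `MNEMONICS` and `mnemonic` are hypothetical and do not appear in the provided files.

```python
# Hypothetical lookup built from the instruction table:
# (opcode, funct3) -> mnemonic. funct3 is None for types without a funct3 field.
MNEMONICS = {
    (0x33, 0x0): "add",
    (0x33, 0x6): "or",
    (0x33, 0x2): "slt",
    (0x33, 0x3): "sltu",
    (0x33, 0x1): "sll",
    (0x67, 0x0): "jalr",
    (0x13, 0x0): "addi",
    (0x13, 0x6): "ori",
    (0x03, 0x0): "lb",
    (0x03, 0x4): "lbu",
    (0x03, 0x2): "lw",
    (0x23, 0x0): "sb",
    (0x23, 0x2): "sw",
    (0x63, 0x0): "beq",
    (0x63, 0x1): "bne",
    (0x63, 0x4): "blt",
    (0x63, 0x5): "bge",
    (0x6F, None): "jal",
    (0x37, None): "lui",
}


def mnemonic(word: int) -> str:
    """Return the mnemonic of a valid 32-bit machine code from this subset."""
    opcode = word & 0x7F           # bits 6..0
    funct3 = (word >> 12) & 0x7    # bits 14..12
    if opcode in (0x6F, 0x37):     # UJ and U types carry no funct3
        funct3 = None
    return MNEMONICS[(opcode, funct3)]
```
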
For further reference, here are the bit lengths of the instruction components.

inst rd, rs1, rs2

| TYPE | FIELDS (most significant bit first) | BITS PER FIELD |
| --- | --- | --- |
| R-TYPE | funct7, rs2, rs1, funct3, rd, opcode | 7, 5, 5, 3, 5, 7 |
| I-TYPE | imm[11:0], rs1, funct3, rd, opcode | 12, 5, 3, 5, 7 |
| S-TYPE | imm[11:5], rs2, rs1, funct3, imm[4:0], opcode | 7, 5, 5, 3, 5, 7 |
| SB-TYPE | imm[12], imm[10:5], rs2, rs1, funct3, imm[4:1], imm[11], opcode | 1, 6, 5, 5, 3, 4, 1, 7 |
| U-TYPE | imm[31:12], rd, opcode | 20, 5, 7 |
| UJ-TYPE | imm[20], imm[10:1], imm[11], imm[19:12], rd, opcode | 1, 10, 1, 8, 5, 7 |

The reference is the RISC-V Green Card. If anything above is mistaken, the RISC-V Green Card prevails.

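The immediate bits of S-, SB-, and UJ-type instructions are scattered across the word, so they have to be reassembled and sign-extended before they can be printed in decimal. The following Python sketch shows one way to do that from the bit layout given here. It is a host-side illustration, not the required assembly solution; all function names are hypothetical, and returning the U-type immediate as an unsigned 20-bit value is an assumption about what venus accepts for `lui`.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def imm_i(word: int) -> int:
    # imm[11:0] sits in bits 31..20
    return sign_extend(word >> 20, 12)


def imm_s(word: int) -> int:
    # imm[11:5] in bits 31..25, imm[4:0] in bits 11..7
    return sign_extend(((word >> 25) << 5) | ((word >> 7) & 0x1F), 12)


def imm_sb(word: int) -> int:
    # imm[12] bit 31, imm[10:5] bits 30..25, imm[4:1] bits 11..8, imm[11] bit 7
    imm = (((word >> 31) & 0x1) << 12) | (((word >> 7) & 0x1) << 11)
    imm |= (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1)
    return sign_extend(imm, 13)


def imm_uj(word: int) -> int:
    # imm[20] bit 31, imm[10:1] bits 30..21, imm[11] bit 20, imm[19:12] bits 19..12
    imm = (((word >> 31) & 0x1) << 20) | (((word >> 12) & 0xFF) << 12)
    imm |= (((word >> 20) & 0x1) << 11) | (((word >> 21) & 0x3FF) << 1)
    return sign_extend(imm, 21)


def imm_u(word: int) -> int:
    # imm[31:12] in bits 31..12, returned unsigned (assumption, see lead-in)
    return (word >> 12) & 0xFFFFF
```

For the reference input, `imm_sb(0x00028863)` yields the branch offset 16 and `imm_uj(0xFF5FF06F)` yields the jump offset -12, matching the comments in `input.S`.
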
#### Named Registers

When you disassemble RISC-V instructions, you must use the named registers defined below. If you use any other name, you may fail some test cases.

| REGISTER | NAME |
| --- | --- |
| x0 | x0 |
| x1 | ra |
| x2 | sp |
| x3 | gp |
| x4 | tp |
| x5-x7 | t0-t2 |
| x8 | s0 |
| x9 | s1 |
| x10-x11 | a0-a1 |
| x12-x17 | a2-a7 |
| x18-x27 | s2-s11 |
| x28-x31 | t3-t6 |

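The register table can be expanded into a 32-entry list indexed by register number, which makes the three register fields easy to name. The Python sketch below is a host-side illustration only; `REGISTER_NAMES` and `register_fields` are hypothetical names, and the field positions (rd in bits 11..7, rs1 in bits 19..15, rs2 in bits 24..20) follow the standard RISC-V formats.

```python
# Index i holds the required name of register x{i}; note that x0 keeps the name "x0".
REGISTER_NAMES = (
    ["x0", "ra", "sp", "gp", "tp"]            # x0-x4
    + ["t0", "t1", "t2"]                      # x5-x7
    + ["s0", "s1"]                            # x8-x9
    + [f"a{i}" for i in range(8)]             # x10-x17
    + [f"s{i}" for i in range(2, 12)]         # x18-x27
    + [f"t{i}" for i in range(3, 7)]          # x28-x31
)


def register_fields(word: int) -> tuple:
    """Return the names found in the (rd, rs1, rs2) bit positions of a word."""
    rd = (word >> 7) & 0x1F
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    return REGISTER_NAMES[rd], REGISTER_NAMES[rs1], REGISTER_NAMES[rs2]
```
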
Here is a very simple template to begin with.

#### Input

A reference input is provided in the `input.S` file. The final test inputs have the same format as the provided input; only the number of machine instructions and the machine codes themselves differ. You may notice that the input format is very similar to that of Homework 3.

```
.data

# Constant integer specifying the lines of machine codes

# DO NOT MODIFY THIS VARIABLE
.globl lines_of_machine_codes
lines_of_machine_codes:
.word 8

# 32-bits machine codes
# A 32-bits hexadecimal number represents one line of machine code.
# You can suppose all of the input machine codes are valid.

# DO NOT MODIFY THIS VARIABLE
.globl machine_codes
machine_codes:
.word 0x000502B3    # add  t0, a0, x0
.word 0x00100313    # addi t1, x0, 1
.word 0x00028863    # beq  t0, x0, 16
.word 0x01DE13B3    # sll  t2, t3, t4
.word 0xFFF28293    # addi t0, t0, -1
.word 0xFF5FF06F    # jal  x0, -12    Here we use offset instead of label.
.word 0x00600533    # add  a0, x0, t1
.word 0x00008067    # jalr x0, ra, 0
```

You may assume that all the machine codes are valid and each of them can be disassembled to one of the instructions mentioned above.

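When checking your disassembler on your own machine, it can help to pull the machine codes out of an input file so you can compare them against your expected output by hand. The Python sketch below assumes that every data word sits on its own `.word` line and that the first `.word` is the line count, as in the reference input. `parse_machine_codes` is a hypothetical helper and not part of the provided files.

```python
import re


def parse_machine_codes(text: str) -> list:
    """Extract the machine codes from the text of an input.S-style file."""
    words = []
    for line in text.splitlines():
        code = line.split("#", 1)[0]            # drop the trailing comment
        match = re.search(r"\.word\s+(\S+)", code)
        if match:
            words.append(int(match.group(1), 0))  # base 0 accepts 8 and 0x...
    if not words:
        raise ValueError("no .word directives found")
    count, codes = words[0], words[1:]
    if count != len(codes):
        raise ValueError("lines_of_machine_codes does not match the data")
    return codes
```
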
#### Output

It is usually the duty of the supervisor (operating system) to handle input/output and to halt program execution. Venus, being a simple emulator, does not offer us such luxury, but it supports a list of primitive environmental calls. A snippet of assembly that exits with error code 0 is already provided in `disassembler.S`. The output line `Exited with error code 0` is necessary, so please use the ID 17 (exit2) environmental call instead of the ID 10 (exit) environmental call.

In this project, you need to output RISC-V instructions. We will compare your output to the correct RISC-V code. Although the output format is not very strict, a few rules apply.

• Your output must be valid input to venus. Depending on the test case, venus may not be able to run your output code, but it must be able to translate your output into machine code.

• Each line may contain only one instruction. Do not put a semicolon at the end of an instruction.

• When dealing with SB-Type instructions, you do not need to create labels. Just put the offset where the label would go, e.g. `bge a0, x0, -12`.

• The commas between registers and immediates are required, e.g. `addi x0, a1, 4` and `lw s0, 4(sp)` are accepted; `addi x0 a1 4` and `lw s0 4(sp)` are not.

• Immediates and offsets must be written as decimal numbers. Other bases, such as hexadecimal and binary, will not be accepted.

• For R-Type (inst rd, rs1, rs2), I-Type (inst rd, rs1, imm / inst rd, offset(rs1)), UJ-Type (inst rd, imm), and U-Type (inst rd, imm): there must be at least one whitespace character between `inst` and `rd`. Besides that, you may add any whitespace before or after `rd`, `rs1`, `rs2`, `offset`, and `imm` for your convenience, e.g. `inst rd, offset( rs1)` and `inst rd, rs1 , imm` will both be accepted.

• For S-Type (inst rs2, offset(rs1)) and SB-Type (inst rs1, rs2, offset): there must be at least one whitespace character between `inst` and `rs1`/`rs2`. The other rules are the same as above.

• When we test your output, we will transform it into our predefined format ourselves, so do not worry too much about formatting. The formats we use are R-Type (inst rd, rs1, rs2), I-Type (inst rd, rs1, imm / inst rd, offset(rs1)), UJ-Type (inst rd, imm), U-Type (inst rd, imm), SB-Type (inst rs1, rs2, offset), and S-Type (inst rs2, offset(rs1)). This means that we may only remove some whitespace from your output.

```
e.g. for the final format. (This work is done by us.)
lw    a2, 4( a3 )   ->   lw a2,4(a3)
```

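To preview how your lines might look after that transformation, the whitespace rules can be approximated as: keep one space after the mnemonic and remove all other whitespace. The Python sketch below is only a guess at such a normalization based on the example given here, not the graders' actual script; `normalize` is a hypothetical name.

```python
def normalize(line: str) -> str:
    """Keep a single space after the mnemonic and strip all other whitespace."""
    parts = line.split(None, 1)       # mnemonic, then the rest of the line
    if not parts:
        return ""                     # blank line
    if len(parts) == 1:
        return parts[0]               # mnemonic without operands
    mnemonic, operands = parts
    return mnemonic + " " + "".join(operands.split())
```
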
The command that we use to test your program’s correctness is

``diff <your_transformed_output> <reference_output>``

You can also test your result using this command.

### Running

Make sure that `venus-jvm-latest.jar`, `disassembler.S`, and `input.S` reside in the same directory. To run your program locally and write the output to `disassembler.output`, use:

``java -jar venus-jvm-latest.jar disassembler.S >> disassembler.output``

To debug your program online, you might want to replace `.import input.S` in `disassembler.S` with the content of `input.S`.

### Tips

• You can use any RISC-V instruction as long as venus can recognize them.

• To interact with the input file, you can use the global labels defined in `input.S`. For example, you can get the `lines_of_machine_codes` with the following code:

``la    a7, lines_of_machine_codes``

• Handwritten assembly files use the extension `.S` to distinguish them from compiler-generated assembly (`.s`).

• You can learn how to output a string, int or char from Lab 3 and Lab 4.

• Almost everything you need can be learned from the venus Wiki.

• Learn how to save to and load from memory in RISC-V.

• Be careful to follow the calling convention; it will make life easier.

• The test cases are very friendly! Don’t focus too much on edge cases; focus on correctness in the common cases.

### Execution

We will test your program using the RISC-V emulator venus. You probably want to read this before you start.

### Submission

• You must follow the RISC-V integer register convention and the RISC-V integer calling convention.
• Meaningful comments must make up at least 25% of the total lines of code you write.
• A comment is defined as a sentence following `#`.
• We will use gitlab with the corresponding `autolab.txt`, as in Project 1.1. From your gitlab repository we will only use `disassembler.S`. Do NOT add venus or any other large files to the git repository. You are welcome to use git to share your test input and other small support files with your teammate.
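
To get a rough idea of whether you meet the 25% comment requirement before submitting, you can count lines on your own machine. The Python sketch below rests on assumptions that the rules here do not spell out: blank lines are ignored, and any non-blank line containing `#` counts as a commented line. It cannot judge whether a comment is meaningful, and `comment_ratio` is a hypothetical helper.

```python
def comment_ratio(source: str) -> float:
    """Fraction of non-blank lines that contain a '#' comment (rough estimate)."""
    lines = [line for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    commented = sum(1 for line in lines if "#" in line)
    return commented / len(lines)


def meets_requirement(source: str, minimum: float = 0.25) -> bool:
    """True if the estimated comment ratio reaches the minimum."""
    return comment_ratio(source) >= minimum
```
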