Project 1-2: RISC-V linker utilities

Computer Architecture I ShanghaiTech University

The Linker Utilities

In part 1 of this project, we wrote an assembler in C. Now we will continue where we left off by implementing part of a linker in RISC-V. The linker processes object files (.out files) and generates an executable file.

The linker has two main tasks: combining code and relocating symbols. Code from each input file's .text segment is merged together to create an executable. This merge also determines the absolute address of each symbol (recall that the assembler outputs a symbol table containing the relative address of each symbol). Once the absolute addresses are known, instructions that rely on absolute addressing can have their addresses filled in.

In this project, you will be asked to implement three parts. Each part is a utility used by the linker.

The skeleton files contain many lines of code, and it can be easy to get lost in the details. Here is an overview of how the linker functions:

  1. Create an empty (global) symbol table. This table will contain absolute addresses.
  2. For each input file, create a separate relocation table. This table will contain relative addresses (why?).
  3. Open each input file. For each input, parse it line by line and look for the .text section, the .symbol section, and the lines that need relocation.
    • If the .text section is found, count the number of instructions in this section and determine the number of bytes the instructions will take.
    • If the .symbol section is found, read each symbol and store it in the symbol table. Convert the local address of each symbol to an absolute address (how do you do this?).
    • If a line that needs relocation is found, read each symbol and store it in the input file's relocation table.
  4. Open the output file.
  5. For each input file, find the .text section and read one instruction at a time. Check whether the instruction requires relocation. If it does, use the symbol table and the relocation table for this input file to relocate. Then, write it into the next line of the output file. If the instruction does not require relocation, write it into the output file directly.
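
The address bookkeeping behind steps 3 and 5 can be sketched in C as follows. This is only an illustration, not code from the skeleton: the function names compute_text_bases and to_absolute are hypothetical, and the sketch assumes that every instruction occupies 4 bytes and that the merged .text segment starts at address 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the skeleton): for each input file,
 * record the byte offset at which its .text segment starts in the output.
 * Assumption: every instruction is 4 bytes and the output starts at 0. */
void compute_text_bases(const int *instr_counts, size_t n_files, int *base)
{
    assert(instr_counts != NULL && base != NULL);
    int offset = 0;
    for (size_t i = 0; i < n_files; i++) {
        base[i] = offset;                /* file i starts where file i-1 ended */
        offset += instr_counts[i] * 4;   /* bytes taken by file i's .text */
    }
}

/* Hypothetical helper: a symbol's absolute address is the base offset of
 * its file plus its address relative to the start of that file. */
int to_absolute(int file_base, int relative_addr)
{
    return file_base + relative_addr;
}
```

For example, with three input files containing 3, 5 and 2 instructions, the second file would start at byte 12 under these assumptions, so a symbol at relative address 8 in that file would end up at absolute address 20.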

For the sake of simplicity, we will skip many of the error-checking steps that a linker would normally perform. A full RISC-V linker works at the compiler toolchain level, which is beyond the scope of this course, so we only write some utilities that help the linker. The checks that you do need to perform are stated in the instructions.

Implementation Steps

Step 0: Obtaining the Files

  1. Download the files here, then extract the archive.

Step 1: String Utilities

In string.s, implement streq(), strlen(), strncpy(), copy_of_str() and dec_to_str(). copy_of_str() should allocate memory dynamically using ecall 9, and it is recommended that you use strlen() and strncpy() in its implementation. Think carefully about how many bytes should be allocated for a string. The main label shows how to test your code: we have integrated some sample test cases there to help you debug.
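
As a rough guide to the expected behavior, here is a minimal C sketch of copy_of_str() and dec_to_str(). It is not the reference solution: the _sketch names, the signatures and the return values are assumptions made for illustration, malloc stands in for the ecall 9 allocation, and dec_to_str is assumed to handle unsigned values only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of copy_of_str: allocate a fresh buffer and copy src into it.
 * Assumption: malloc stands in for the ecall 9 allocation in Venus. */
char *copy_of_str_sketch(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);          /* number of characters, without '\0' */
    char *dst = malloc(len + 1);       /* one extra byte for the terminator */
    if (dst == NULL)
        return NULL;
    strncpy(dst, src, len + 1);        /* copies the characters and the '\0' */
    return dst;
}

/* Sketch of dec_to_str: write value in base 10 into buf, NUL-terminated.
 * Assumptions: value is unsigned, buf has room for at least 11 bytes,
 * and the return value is the number of digits written. */
int dec_to_str_sketch(unsigned int value, char *buf)
{
    assert(buf != NULL);
    char tmp[10];                      /* digits are produced in reverse order */
    int n = 0;
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    for (int i = 0; i < n; i++)        /* reverse into the output buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

In RISC-V, the same logic applies, but the digit arithmetic and the reversal have to be written with explicit loads, stores and branches.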

Note that if you have not implemented a function, the tester may crash. You can comment out test cases for functions that you have not yet implemented.

You should implement these functions first, because later tasks depend on them. Even our final test cases will use some of the functions you write here. Therefore, if you cannot pass some string test cases, you will also lose points in other parts.

Step 2: Symbol List

In symbol_list.s, complete the implementation of SymbolList, which serves the same purpose that SymbolTable did in project 1-1. You should implement the functions new_node(), symbol_for_addr(), addr_for_symbol() and add_to_list(). SymbolList uses a linked list to keep track of (symbol addr, symbol name) pairs. An empty SymbolList is simply a pointer to NULL. When an (addr, name) pair is added to a SymbolList, a new list node is created and added to the front of the list. If the SymbolList list node structure were to be declared in C, it would be:

typedef struct symbollist { 
    int addr;
    char* name;
    struct symbollist* next; 
} SymbolList;

Because this data structure stores (addr, name) pairs, the addr mentioned in the comments means the int value addr, not the address of the symbol list node.

Test cases have been integrated into the code. You do NOT need to free the list, as Venus has no free ecall. Also, print_symbol() and print_list() have been implemented for you. If you are stuck on a function, taking a look at these two functions may help.
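
The list operations described above can be sketched in C as follows. This is a sketch under assumptions rather than the skeleton's actual interface: the _sketch function names, their signatures, the "not found" return values (-1 and NULL), and the choice to store the name pointer without copying it are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct symbollist {
    int addr;
    char *name;
    struct symbollist *next;
} SymbolList;

/* Create a node for (addr, name) and put it at the front of the list.
 * Returns the new head. Assumption: the name pointer is stored as is. */
SymbolList *add_to_list_sketch(SymbolList *head, int addr, char *name)
{
    assert(name != NULL);
    SymbolList *node = malloc(sizeof *node);
    if (node == NULL)
        return head;                  /* allocation failed: list unchanged */
    node->addr = addr;
    node->name = name;
    node->next = head;                /* old head becomes the second node */
    return node;
}

/* Look up the addr stored for name. Assumption: -1 means "not found". */
int addr_for_symbol_sketch(const SymbolList *head, const char *name)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->name, name) == 0)
            return head->addr;
    return -1;
}

/* Look up the name stored for addr. Assumption: NULL means "not found". */
const char *symbol_for_addr_sketch(const SymbolList *head, int addr)
{
    for (; head != NULL; head = head->next)
        if (head->addr == addr)
            return head->name;
    return NULL;
}
```

Starting from an empty list (a NULL pointer), adding (0, "main") and then (16, "loop") leaves the "loop" node at the head, since new nodes are always inserted at the front.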

Step 3: Parser tool

Before you continue, it may be a good idea to look at the CALL lecture slides as a quick refresher.

This step requires you to make changes in parsetools.s. In this file, you will be asked to implement hex_to_str() and parse_int(). The first function writes a 32-bit number in hexadecimal format. The second function parses a string in base 10 or 16 and generates the corresponding integer. It will be really useful to take a look at an ASCII table.

We have also integrated test cases into this file, which you can use for debugging.

Suggested Division of Work

string.s is the most important part of the project, and other functions depend on it, so you should implement its functions first. parsetools.s contains some similar functions, so it is best for one group member to implement both of these files. The remaining part (symbol_list.s) can then be written by the other teammate.

Coding Restrictions

  1. No function should be longer than 100 lines.
  2. You are only allowed to use the following registers:
    1. a0-a3
    2. s2-s5
    3. t0-t5
    4. sp, ra, zero
  3. You are not allowed to use any numbered registers like x19.
  4. You are also required to write a meaningful comment in English on at least every second line you add!


You should submit a compressed file named proj1_2.tar to Autolab.

The directory tree of your submission should look like the following:

    |-- string
        |- streq.s
        |- strlen.s
        |- strncpy.s
        |- copy_of_str.s
        |- dec_to_str.s
    |-- parsetools
        |- hex_to_str.s
        |- parse_int.s
    |-- symbol_list
        |- new_node.s
        |- symbol_for_addr.s
        |- addr_for_symbol.s
        |- add_to_list.s


Each .s file should contain only the function that its name corresponds to. DO NOT add any other code to the file (including assembler directives such as .text). You are also required to write a meaningful comment in English on at least every second line in these files.

Note: your submission should NOT contain a main() function.

Autolab will open later.