MIPS assembler

Data structures

std::map<std::string,int> labelMap maps each label name to the sequence number of the instruction it labels.
map<string, int> registerNumber maps each register name to its register number.
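
As an illustration of what registerNumber might contain, here is a minimal sketch that fills such a map with the conventional MIPS register names. The helper name buildRegisterNumber is hypothetical and is not taken from the project; the real table may be built differently or may accept other spellings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper (not from the project): builds a table like
// registerNumber from the conventional MIPS names, $zero = 0 ... $ra = 31.
std::map<std::string, int> buildRegisterNumber() {
    static const char* const names[32] = {
        "$zero", "$at", "$v0", "$v1", "$a0", "$a1", "$a2", "$a3",
        "$t0", "$t1", "$t2", "$t3", "$t4", "$t5", "$t6", "$t7",
        "$s0", "$s1", "$s2", "$s3", "$s4", "$s5", "$s6", "$s7",
        "$t8", "$t9", "$k0", "$k1", "$gp", "$sp", "$fp", "$ra"};
    std::map<std::string, int> registerNumber;
    for (int i = 0; i < 32; ++i) {
        registerNumber[names[i]] = i;  // the array index is the register number
    }
    assert(registerNumber.size() == 32);  // every name must be distinct
    return registerNumber;
}
```

With this table, a lookup such as buildRegisterNumber().at("$t1") yields the 5-bit value that later goes into the rs, rt or rd field.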

struct InstructionInfo stores an instruction’s name, type, opcode and funct. It also has a member function named convert that converts a lexer::instruction struct to an int.
map<string, InstructionInfo> instructInfoMap maps each instruction name to its InstructionInfo. It is initialized in the function addInstructInfo.
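
The following is a rough sketch of how such a table could look, assuming a field layout that mirrors the description above. The names InstructionInfoSketch, InstrType and buildInstructInfoMap are hypothetical, the convert member is left out, and only a handful of instructions are listed. The opcode and funct values are the standard MIPS encodings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical layout; the real InstructionInfo may name or type its
// fields differently and also carries the convert member function.
enum class InstrType { R, I };

struct InstructionInfoSketch {
    std::string name;
    InstrType type;
    int opcode;  // 6-bit opcode field
    int funct;   // 6-bit funct field, meaningful for R type only
};

// Fills a table keyed by instruction name, in the spirit of addInstructInfo.
std::map<std::string, InstructionInfoSketch> buildInstructInfoMap() {
    std::map<std::string, InstructionInfoSketch> table;
    auto add = [&table](const std::string& name, InstrType type,
                        int opcode, int funct) {
        assert(opcode >= 0 && opcode < 64 && funct >= 0 && funct < 64);
        table[name] = InstructionInfoSketch{name, type, opcode, funct};
    };
    add("add", InstrType::R, 0x00, 0x20);
    add("sll", InstrType::R, 0x00, 0x00);
    add("addi", InstrType::I, 0x08, 0x00);
    add("lbu", InstrType::I, 0x24, 0x00);
    add("lui", InstrType::I, 0x0F, 0x00);
    add("beq", InstrType::I, 0x04, 0x00);
    add("bne", InstrType::I, 0x05, 0x00);
    return table;
}
```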

Functions and algorithm

The main function in main.cpp reads each input file into a vector of instructions. It first adds every label name and its associated sequence number to labelMap. It then iterates over the instruction vector, converts each instruction to an int with the function convertInstruction in myUtil.h, and prints the result in hexadecimal to the output file.
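
The two passes described above can be sketched roughly as follows. InstrSketch is a stand-in for lexer::instruction, whose real fields are not shown here, so the label field is an assumption. Numbering instructions from 0 and printing 8 zero-padded hex digits are also assumptions. The helper names buildLabelMap and toHex are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for lexer::instruction (assumed layout).
struct InstrSketch {
    std::string label;  // empty when the instruction carries no label
    std::string name;
    std::vector<std::string> args;
};

// First pass: record the sequence number of every labeled instruction.
std::map<std::string, int> buildLabelMap(const std::vector<InstrSketch>& program) {
    std::map<std::string, int> labelMap;
    for (std::size_t i = 0; i < program.size(); ++i) {
        const std::string& label = program[i].label;
        if (!label.empty()) {
            assert(labelMap.count(label) == 0);  // duplicate labels are ambiguous
            labelMap[label] = static_cast<int>(i);
        }
    }
    return labelMap;
}

// Second pass helper: formats one encoded word as 8 lowercase hex digits.
std::string toHex(std::uint32_t word) {
    std::ostringstream out;
    out << std::hex << std::setw(8) << std::setfill('0') << word;
    return out.str();
}
```

Collecting all labels before encoding anything is what lets a branch refer to a label that appears later in the file.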

The function convertInstruction looks up the instruction’s InstructionInfo in instructInfoMap by name and invokes its convert function to convert the instruction to an int.

The convert function of InstructionInfo processes an instruction according to its type. It extracts rs, rt, rd, the immediate and other fields from the lexer::instruction, and finally invokes constructR or constructI to combine the parts into an int, which it returns.

R type instructions are further split into two kinds: one like “add $t1 $t2 $t3”, and another like “sll $t1 $t2 20”. Each kind has its own way of computing rd, rs and rt.

I type instructions are further split into four kinds: 1) like “addi $t1 $t2 100”; 2) like “lbu $t1 -20($t2)”; 3) like “lui $t1 20”; 4) like “beq $t1 $t2 label”. The kind determines how rt, rs and the immediate are read from the args vector of the lexer::instruction.

The fourth kind of I type, the branch instructions beq and bne, needs special processing. The target’s sequence number is looked up in labelMap, and the immediate is computed as the target’s sequence number minus the current instruction’s sequence number, minus 1. If the result is negative, it is converted to its 16-bit two’s complement representation by adding 65536.
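
Two pieces of this processing are easy to get wrong, so here is a small sketch of each: the branch immediate rule, and splitting an operand of the form offset(base) such as “-20($t2)”. The function names branchImmediate and parseMemOperand are hypothetical, and the real convert function may organize this logic differently.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Branch immediate as described above: target minus current minus 1,
// with negative results mapped to 16-bit two's complement by adding 65536.
int branchImmediate(int currentSeq, int targetSeq) {
    const int offset = targetSeq - currentSeq - 1;
    assert(offset >= -32768 && offset <= 32767);  // must fit in 16 bits
    return offset < 0 ? offset + 65536 : offset;
}

// Splits an operand such as "-20($t2)" into its offset and base register.
// An omitted offset, as in "($t2)", is treated as 0 (an assumption).
std::pair<int, std::string> parseMemOperand(const std::string& operand) {
    const std::string::size_type open = operand.find('(');
    const std::string::size_type close = operand.find(')');
    assert(open != std::string::npos && close != std::string::npos && open < close);
    const std::string offsetText = operand.substr(0, open);
    const int offset = offsetText.empty() ? 0 : std::stoi(offsetText);
    return {offset, operand.substr(open + 1, close - open - 1)};
}
```

For example, a beq at sequence number 5 that targets sequence number 2 has the offset 2 - 5 - 1 = -4, which becomes 65532 (0xFFFC) in the 16-bit immediate field.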

The function constructR constructs the R type instruction from opcode, rs, rt, rd, shamt and funct.
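
A minimal sketch of this packing step, assuming the standard MIPS R type layout (opcode 6 bits, rs 5, rt 5, rd 5, shamt 5, funct 6). The name constructRSketch and the use of std::uint32_t instead of int are choices made for this example, not the project’s actual signature.

```cpp
#include <cassert>
#include <cstdint>

// Packs the six R type fields into one 32-bit word:
// opcode[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0]
std::uint32_t constructRSketch(std::uint32_t opcode, std::uint32_t rs,
                               std::uint32_t rt, std::uint32_t rd,
                               std::uint32_t shamt, std::uint32_t funct) {
    assert(opcode < 64 && funct < 64);             // 6-bit fields
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32);  // 5-bit fields
    return (opcode << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | funct;
}
```

With $t1 = 9, $t2 = 10 and $t3 = 11, “add $t1 $t2 $t3” corresponds to constructRSketch(0, 10, 11, 9, 0, 0x20), which gives 0x014B4820. “sll $t1 $t2 20” corresponds to constructRSketch(0, 0, 10, 9, 20, 0), which gives 0x000A4D00.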

The function constructI constructs the I type instruction from opcode, rs, rt and immediate.
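
A matching sketch for the I type, assuming the standard layout (opcode 6 bits, rs 5, rt 5, immediate 16). The name constructISketch and the parameter types are assumptions. The immediate is masked to 16 bits, so both a negative value such as -20 and an already converted value such as 65516 produce the same field.

```cpp
#include <cassert>
#include <cstdint>

// Packs the four I type fields into one 32-bit word:
// opcode[31:26] rs[25:21] rt[20:16] immediate[15:0]
std::uint32_t constructISketch(std::uint32_t opcode, std::uint32_t rs,
                               std::uint32_t rt, std::int32_t immediate) {
    assert(opcode < 64 && rs < 32 && rt < 32);
    assert(immediate >= -32768 && immediate <= 65535);  // representable in 16 bits
    const std::uint32_t imm16 = static_cast<std::uint32_t>(immediate) & 0xFFFFu;
    return (opcode << 26) | (rs << 21) | (rt << 16) | imm16;
}
```

With $t1 = 9 and $t2 = 10, “addi $t1 $t2 100” corresponds to constructISketch(0x08, 10, 9, 100), which gives 0x21490064, and “lbu $t1 -20($t2)” corresponds to constructISketch(0x24, 10, 9, -20), which gives 0x9149FFEC.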