Lexical analysis
I define a struct, the lexe Lexeme meType records the lexeme’s type which is either Identifier or String, the str records the string of the lexeme. The lexical analysis result will be stored in an array of the Lexeme structs.
struct Lexeme
{
enum LexemeType lexemeType;
char str[100];
};
I use the regular expression function provided by
Data structures
In stack.h, I define the data structure stack used.
struct Stack {
int elements[MAX_STACK_SIZE];
int size;
};
In functions.h, the data stack, type stack, return stack, symbol table, string memory and virtual machine are defined.
Stack * data_stack = NULL;
Stack * type_stack = NULL;
Stack * return_stack = NULL;
int virtual_machine[MAX_VM_SIZE];
int mm_str[MAX_VM_SIZE];
typedef void (*FuncType)(void);
struct Function {
char name[100];
FuncType fun;
int vmLocation;
};
typedef struct Function Function;
Function functions[1000];
The struct Function record the name of the function, the function pointer and the VM location of the function. So I can search in the functions array by function name to find its function pointer or VM location.
Functions
In addAllBasicFunction, the built-in functions are added to the system. In initVM, the virtual machine are initialized for the built-in functions. The C functions such as void str(void), void printTop(void), void plus(void) and others are defined for LAC built-in functions “str”, “.”, “+” and others.
There are various math operations such as “+”, “-“, “*”, “/”, “=”. To reduce code duplication, I define a helper function void mathOp(char op), and the function plus, sub, mult, divFunc and equal simply invoke this function.
The functions are mainly to manipulate the stacks. For math operation, it pops the two operands from the stack, and apply operation on them to get the result and then push the resul to the stack. The dup, swap, drop are simple, dup is to push the same element same with top element to the stack. The swap is to swap the top 2 elements in the stack, the drop is to remove the top element. The function cr is just to print a new line and the function exit is to exit the system.
The count and type function are for string value, they are more complex. The count function removes the string length from string memory and push the string length to the data stack. And the type function pops the string length and string position out and print the string in string memory to console.
The remaining functions lit, fin, str, if, then, else are used only in compiler mode. Functions described previously only manipulate data stack and type stack. These Functions mainly manipulate the return stack. They enable the system not only can execute command in sequence, but also can have condition branches, define new function and call other functions. Function Function lit increase the top element of return stack by 1 and push the VM value indexed by the top element to the data stack.
Function fin pops the return stack.
Function ifFunc pops the data stack, if the value is true, then increase the top element of the return stack by 1, so that the system can execute the branch just after if. If the value is false, It pops the top element of return stack, and push the VM value indexed by the top element plus 1 to the return stack.
Function elseFunc is also to pops the top element of return stack, and push the VM value indexed by the top element plus 1 to the return stack.
Function thenFunc is to increase the top element of the return stack by 1.
Interpreter
First, use initSystem to initialize the system, this includes create all stacks, set their size to 0, add built-in functions of LAC to the system.
Then convert the LAC codes to lexemes. For each lexeme, if it is a Identifier, search in the Functions array by name, if found,
then it is a function name and call the corresponding function.
Otherwise it is a number, push to the data stack. If it is a string, store it to the string memory and push the position in the memory to the data stack.
Compiler
The compiler has two phases, in the first phase, it converts the lexmes and process each lexeme to construct the VM. In the second phase, it runs the VM to execute the LAC program.
The same with interpreter to initialize the system, in addition, it needs to record the VM location of built-in function lit, fin, str, if and else.
It also converts the LAC codes to lexemes. Then it processes each lexeme. If the lexeme is a number, it adds the VM location of lit and the number itself to the VM. If the lexeme is a string, it stores the string to string memory and adds the VM location of str and the position of the string in the memory to the VM. If it is identifier “:” which means the start of user defined function, it adds the function to functions array and adds number 1 to VM to indicate that is is user defined instead of 0 for built-in function. If the lexeme is “;” which indicates the end of user defined function, it adds the VM location of fin to the VM. If the lexeme is “if”, it adds the VM location of “if” to the VM. If the lexeme is “else”, it adds the VM location of “else” to the VM. If the lexeme is “then”, if sets the value of start of branch location in VM according to whether there is “else” branch.
After processing all the lexemes, the VM is complete and the system can run the LAC program according to the VM. The function runVM has a int value to record the location of the VM, it is like the PC register to record the next instruction position. It runs the VM until the position is larger than the VM size. It runs according to whether the function is user-defined or built-in.
System Usage
compile the project: make
usage:
interactive interpreter: ./run
compile file and run: ./run
this command will also generate .out compiled binary file in the same directory with lac file
run compiled .out file: ./run
interpret file: ./run -i
Testing
Test files: factorielle.lac, increase.lac, interpretTest.lac and mytest.lac. Test both compile mode and interpret mode.
increase.lac
: incr 1 + ;
: 2+. incr incr . ;
10 2+.
interpretTest.lac
3 5 6 + 10 * 9 – . .
” good” count type cr
10 5 swap – dup + .
20 30 drop 40 + .
mytest.lac
: 2+. 2 + . ;
5 2+.
” good” .
0 1 = .
1 2 swap – .
/docProps/thumbnail.jpeg