程序设计综合实践
需求
任务
实现一个简单指令集的模拟器
DLX
实现语言及开发平台:C/C++
Linux平台(Ubuntu 14.04 64位)
模拟器
指令相关文本文件
结果文本文件
程序设计综合实践
2020年秋
DLX指令集
Deluxe
32位的RISC架构指令集
John Hennessy and David Patterson
Load/Store 体系结构
3种指令
I、R、J
核心指令(课程要求)
扩展指令:浮点..(高级要求)
程序设计综合实践
2020年秋
deluxe |dəˈlʌks,-ˈlʊks|
1990,
寄存器
32个32位寄存器
R0…R31
特殊要求
R0永远是0,并且禁止被修改
特殊寄存器
PC
SR、CA、EPC、ESR、ECA、EMAR(不作要求)
程序设计综合实践
2020年秋
内存布局
每个内存单元8位(1个字节)
内存可以以字节、字的形式访问
字访问的地址必须被4整除
每条指令4个字节
指令地址必须被4整除
数据大端布局
程序设计综合实践
2020年秋
指令
不作要求
程序设计综合实践
2020年秋
I类型指令
内存指令
lb、lw、sw、sb
算术及位运算指令
addi、subi、andi、ori、xori
测试指令
sgti、seqi、sgei、slti、snei、slei
控制指令
beqz、bnez、jr
语法:beqz RS1 imm、beqz RS1 label1
语法:jr RS1、jr label1
语法:
语法:
语法:
程序设计综合实践
2020年秋
R类型指令
算术及位运算指令
add、sub、and、or、xor
语法: add RD RS1 RS2
移位指令
slli、srli
语法: slli RD RS1 SA
测试指令
sgt、seq、sge、slt、sne、sle
语法: sgt RD RS1 RS2
程序设计综合实践
2020年秋
特殊指令
停机
halt
输出(10进制)
输出寄存器的内容:op RS
输出内存内容:op RS imm
在屏幕上输出的同时,也换行输出到结果文件中
程序设计综合实践
2020年秋
例子
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 100 ;
bnez R3 loop ;
op R1 ;
halt
loop :
程序设计综合实践
2020年秋
例子
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 100 ;
bnez R3 -4;
op R1 ;
halt
loop :
程序设计综合实践
2020年秋
例子
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 100 ;
bnez R3 -1;
op R1 ;
halt
loop :
程序设计综合实践
2020年秋
例子
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 100 ;
bnez R3 1;
op R1 ;
halt
loop :
程序设计综合实践
2020年秋
例子
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 100 ;
jr RS1;
op R1 ;
halt
loop :
jr RS1
jr 0
jr 1
jr -1
jr -3
程序设计综合实践
2020年秋
基本要求
输入文本格式:
指令序列,使用“;”分隔开
每个指令可以有一个唯一标记(标识符),以方便跳转
lable1 : add RD RS1 RS2
指令操作名大小写都可以,寄存器名字大小写都可以
内存文件
每行代表一个内存单元的值,地址:值
例如:0x00000000:0x11
输出
文本格式结果文件
程序设计综合实践
2020年秋
基本要求
程序输入
simulator
-i
-m
-o
程序设计综合实践
2020年秋
可能的优化
优化1:控制流和数据流依赖分析
addi R4 R0 1 ;
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 10000000000;
bnez R3 loop ;
op R4 ;
halt
loop :
程序设计综合实践
2020年秋
可能的优化
优化1:控制流和数据流依赖分析
addi R4 R0 1 ;
addi R1 R0 0 ;
addi R2 R0 0 ;
add R1 R1 R2 ;
addi R2 R2 1 ;
slei R3 R2 10000000000;
bnez R3 loop ;
op R4 ;
halt
loop :
程序设计综合实践
2020年秋
可能的优化
优化2:并发执行
addi R1 R0 0 ; addi R2 R0 0 ;
addi R4 R0 0 ; addi R5 R0 0 ;
add R1 R1 R2 ; addi R2 R2 1 ;
slei R3 R2 1000000 ; bnez R3 loop ;
add R4 R4 R5 ; addi R5 R5 1 ;
slei R6 R5 1000000 ; bnez R6 loop1 ;
op R1 ; op R4;
halt
loop :
loop1 :
程序设计综合实践
2020年秋
可能的优化
优化2:并发执行
addi R1 R0 0 ; addi R2 R0 0 ;
addi R4 R0 0 ; addi R5 R0 0 ;
add R1 R1 R2 ; addi R2 R2 1 ;
slei R3 R2 1000000 ; bnez R3 loop ;
add R4 R4 R5 ; addi R5 R5 1 ;
slei R6 R5 1000000 ; bnez R6 loop1 ;
op R1 ; op R4;
halt
loop :
loop1 :
程序设计综合实践
2020年秋
加分项目
灵活的语法
支持浮点指令
支持高级语言的翻译
执行效率优化设计
如有需要可联系老师或教辅
已与王挺老师商议好,可以作为编译原理课程综合实验
程序设计综合实践
2020年秋
作品评分网站
每个小组
可重复提交源代码压缩包(必须tar格式)
代码包要求
必须包含Makefile文件
解压make后生成可执行文件
可执行文件名要求:DLXSimulator+组号
后台自动运行和综合评分
定期自动查重,严打抄袭
程序设计综合实践
2020年秋
自学内容
Linux基本操作
Ubuntu Linux 14.04 64位
GNU Make的基本使用
熟悉DLX指令
程序设计综合实践
2020年秋
5
Data FormatData Format
n Byte ordering adheres to the Big Endian ordering
Ë The most significant byte is always in the lowest byte
address in a word or halfword
mem[0] ←← 0xAABBCCDD
DD
CC
BB
AA
AA
BB
CC
DD
3
2
1
0
Big Endian Little Endian
byte
address
AddressingAddressing
n Memory is byte addressable
ËStrict address alignment is enforced
n Halfword memory accesses are restricted to
even memory address
Ëaddress = address & 0xfffffffe
n Word memory accesses are restricted to
memory addresses divisible by 4
Ëaddress = address & 0xfffffffc
5
Data FormatData FormatnByte ordering adheres to the Big Endian orderingËThe most significant byte is always in the lowest byte address in a word or halfwordmem[0]¬¬0xAABBCCDDDD
CC
BB
AA
AA
BB
CC
DD
3
2
1
0
Big Endian Little Endian
byte
address
Addressing
Addressing
nMemory is byte addressable
ËStrict address alignment is enforced
nHalfword memory accesses are restricted to
even memory address
Ëaddress = address & 0xfffffffe
nWord memory accesses are restricted to
memory addresses divisible by 4
Ëaddress = address & 0xfffffffc
5
Data FormatData Format
n Byte ordering adheres to the Big Endian ordering
Ë The most significant byte is always in the lowest byte
address in a word or halfword
mem[0] ←← 0xAABBCCDD
DD
CC
BB
AA
AA
BB
CC
DD
3
2
1
0
Big Endian Little Endian
byte
address
AddressingAddressing
n Memory is byte addressable
ËStrict address alignment is enforced
n Halfword memory accesses are restricted to
even memory address
Ëaddress = address & 0xfffffffe
n Word memory accesses are restricted to
memory addresses divisible by 4
Ëaddress = address & 0xfffffffc
5
Data FormatData FormatnByte ordering adheres to the Big Endian orderingËThe most significant byte is always in the lowest byte address in a word or halfwordmem[0]¬¬0xAABBCCDD
DD
CC
BB
AA
AA
BB
CC
DD
3
2
1
0
Big Endian Little Endian
byte
address
Addressing
Addressing
nMemory is byte addressable
ËStrict address alignment is enforced
nHalfword memory accesses are restricted to
even memory address
Ëaddress = address & 0xfffffffe
nWord memory accesses are restricted to
memory addresses divisible by 4
Ëaddress = address & 0xfffffffc
Appendix A
DLX INSTRUCTION
SET
ARCHITECTURE
520
Table A.1 Special purpose registers of the DLX fixed point core
I register
II usage
PC program counter points to the next instruction
SR status register holds interrupt masks (among others)
CA cause register records pending interrupts
EPC, exception registers on a jump to the interrupt service rou-
ESR, tine they backup the current value of
ECA, PC, SR, CA respectively the current
EMAR memory address
6 5 5 16
I-type
opcode RSl RD immediate
6 5 5 5 5 6
R-type
opcode RSl
RS2 RD SA function
6 26
J-type
opcode
PC offset
Figure A.1 The three instruction formats of the DLX design. The fields RS 1 and
RS2 specify the source registers, and the field RD specifies the destination regis-
ter. Field SA specifies a special purpose register or an immediate shift amount.
Function field is an additional 6-bit opcode.
A.I.1 Instruction Formats
All three instruction formats (figure A.I) have a 6-bit primary opcode and
specify up to three explicit operands. The I-type (Immediate) format spec-
ifies two registers and a I6-bit constant. That is the standard layout for
instructions with an immediate operand. The J-type (Jump) format is used
for control instructions. They require no explicit register operand and profit
from the larger 26-bit immediate operand. The third format, R-type (Regis-
ter) format, provides an additional 6-bit opcode (junction). The remaining
20 bits specify three general purpose registers and a field SA which spec-
ifies a 5-bit constant or a special purpose register. A 5-bit constant, for
example, is sufficient as shift amount.
Appendix A
DLX INSTRUCTION
SET
ARCHITECTURE
520
Table A.1 Special purpose registers of the DLX fixed point core
I register
II usage
PC program counter points to the next instruction
SR status register holds interrupt masks (among others)
CA cause register records pending interrupts
EPC, exception registers on a jump to the interrupt service rou-
ESR, tine they backup the current value of
ECA, PC, SR, CA respectively the current
EMAR memory address
6 5 5 16
I-type
opcode RSl RD immediate
6 5 5 5 5 6
R-type
opcode RSl
RS2 RD SA function
6 26
J-type
opcode
PC offset
Figure A.1 The three instruction formats of the DLX design. The fields RS 1 and
RS2 specify the source registers, and the field RD specifies the destination regis-
ter. Field SA specifies a special purpose register or an immediate shift amount.
Function field is an additional 6-bit opcode.
A.I.1 Instruction Formats
All three instruction formats (figure A.I) have a 6-bit primary opcode and
specify up to three explicit operands. The I-type (Immediate) format spec-
ifies two registers and a I6-bit constant. That is the standard layout for
instructions with an immediate operand. The J-type (Jump) format is used
for control instructions. They require no explicit register operand and profit
from the larger 26-bit immediate operand. The third format, R-type (Regis-
ter) format, provides an additional 6-bit opcode (junction). The remaining
20 bits specify three general purpose registers and a field SA which spec-
ifies a 5-bit constant or a special purpose register. A 5-bit constant, for
example, is sufficient as shift amount.
38 CHAPTER 7. A SIMPLIFIED DLX
• Special Registers: MAR, MDR, A, B and C;
The main modules in the datapath of the simplified DLX are as follows:
• The GPR-File is a dual-port RAM which supports either two reads or one write.
• The ALU supports 2’s complement integer addition, subtraction, comparison and bitwise
logical operations.
• The Shifter supports logical left and right shifts by one position.
7.1.3 Instruction Formats
There are two instruction formats:
1. An instruction in the I-Type-Format is divided into four fields depicted below.
Opcode RS1 RD immediate
6 5 165
2. An instruction in the R-Type-Format is divided into five fields depicted below. Note that
the 5 bit field used in the DLX-Architecture for the shift amount (SA) is not used in the
simplified DLX.
Opcode RS1 RDRS2 Function
6 5 65 5 5
7.1.4 Instruction Set
We list below the instruction set of the simplified DLX. Note that imm denotes the value of the
immediate field in an I-Type-Instruction. sext(imm) denotes the 2’s complement sign extension of
imm to 32 bits.
• Load/Store-instructions:
Load/Store Semantics
lw RD RS1 imm RD := M(sext(imm)+RS1)
sw RD RS1 imm M(sext(imm)+RS1) := RD
• Immediate–instructions:
Instruction Semantics
addi RD RS1 imm RD := RS1 + sext(imm)
• Shift–/Compute–Instructions:
38CHAPTER7.ASIMPLIFIEDDLX•SpecialRegisters:MAR,MDR,A,BandC;ThemainmodulesinthedatapathofthesimplifiedDLXareasfollows:•TheGPR-Fileisadual-portRAMwhichsupportseithertworeadsoronewrite.•TheALUsupports2’scomplementintegeraddition,subtraction,comparisonandbitwiselogicaloperations.•TheShiftersupportslogicalleftandrightshiftsbyoneposition.7.1.3InstructionFormatsTherearetwoinstructionformats:1.AninstructionintheI-Type-Formatisdividedintofourfieldsdepictedbelow.OpcodeRS1RDimmediate65 1652.AninstructionintheR-Type-Formatisdividedintofivefieldsdepictedbelow.Notethatthe5bitfieldusedintheDLX-Architecturefortheshiftamount(SA)isnotusedinthesimplifiedDLX.OpcodeRS1RDRS2 Function6 565557.1.4InstructionSetWelistbelowtheinstructionsetofthesimplifiedDLX.NotethatimmdenotesthevalueoftheimmediatefieldinanI-Type-Instruction.sext(imm)denotesthe2’scomplementsignextensionofimmto32bits.•Load/Store-instructions:Load/Store Semantics
lwRDRS1immRD:=M(sext(imm)+RS1)
swRDRS1immM(sext(imm)+RS1):=RD
•Immediate–instructions:
Instruction Semantics
addiRDRS1immRD:=RS1+sext(imm)
•Shift–/Compute–Instructions:
38 CHAPTER 7. A SIMPLIFIED DLX
• Special Registers: MAR, MDR, A, B and C;
The main modules in the datapath of the simplified DLX are as follows:
• The GPR-File is a dual-port RAM which supports either two reads or one write.
• The ALU supports 2’s complement integer addition, subtraction, comparison and bitwise
logical operations.
• The Shifter supports logical left and right shifts by one position.
7.1.3 Instruction Formats
There are two instruction formats:
1. An instruction in the I-Type-Format is divided into four fields depicted below.
Opcode RS1 RD immediate
6 5 165
2. An instruction in the R-Type-Format is divided into five fields depicted below. Note that
the 5 bit field used in the DLX-Architecture for the shift amount (SA) is not used in the
simplified DLX.
Opcode RS1 RDRS2 Function
6 5 65 5 5
7.1.4 Instruction Set
We list below the instruction set of the simplified DLX. Note that imm denotes the value of the
immediate field in an I-Type-Instruction. sext(imm) denotes the 2’s complement sign extension of
imm to 32 bits.
• Load/Store-instructions:
Load/Store Semantics
lw RD RS1 imm RD := M(sext(imm)+RS1)
sw RD RS1 imm M(sext(imm)+RS1) := RD
• Immediate–instructions:
Instruction Semantics
addi RD RS1 imm RD := RS1 + sext(imm)
• Shift–/Compute–Instructions:
38CHAPTER7.ASIMPLIFIEDDLX•SpecialRegisters:MAR,MDR,A,BandC;ThemainmodulesinthedatapathofthesimplifiedDLXareasfollows:•TheGPR-Fileisadual-portRAMwhichsupportseithertworeadsoronewrite.•TheALUsupports2’scomplementintegeraddition,subtraction,comparisonandbitwiselogicaloperations.•TheShiftersupportslogicalleftandrightshiftsbyoneposition.7.1.3InstructionFormatsTherearetwoinstructionformats:1.AninstructionintheI-Type-Formatisdividedintofourfieldsdepictedbelow.OpcodeRS1RDimmediate65 1652.AninstructionintheR-Type-Formatisdividedintofivefieldsdepictedbelow.Notethatthe5bitfieldusedintheDLX-Architecturefortheshiftamount(SA)isnotusedinthesimplifiedDLX.OpcodeRS1RDRS2 Function6 565557.1.4InstructionSetWelistbelowtheinstructionsetofthesimplifiedDLX.NotethatimmdenotesthevalueoftheimmediatefieldinanI-Type-Instruction.sext(imm)denotesthe2’scomplementsignextensionofimmto32bits.•Load/Store-instructions:Load/Store SemanticslwRDRS1immRD:=M(sext(imm)+RS1)swRDRS1immM(sext(imm)+RS1):=RD•Immediate–instructions:Instruction Semantics
addiRDRS1immRD:=RS1+sext(imm)
•Shift–/Compute–Instructions:
7.2. IMPLEMENTATION 39
Instruction Semantics
slli RD RS1 RD := RS1 << 1
srli RD RS1 RD := RS1 >> 1
add RD RS1 RS2 RD := RS1 + RS2
sub RD RS1 RS2 RD := RS1 − RS2
and RD RS1 RS2 RD := RS1 ∧ RS2
or RD RS1 RS2 RD := RS1 ∨ RS2
xor RD RS1 RS2 RD := RS1 ⊕ RS2
• Test–Instructions:
Instruction Semantics
sreli RD RS1 imm RD := 1, if condition is satisfied,
RD := 0 otherwise
if rel =lt test if RS1 < sext(imm)
if rel =eq test if RS1 = sext(imm)
if rel =gt test if RS1 > sext(imm)
if rel =le test if RS1 ≤ sext(imm)
if rel =ge test if RS1 ≥ sext(imm)
if rel =ne test if RS1 ̸= sext(imm)
• Jump–instructions:
Instruction Semantics
beqz RS1 imm PC = PC + 1 + sext(imm), if RS1 = 0
PC = PC + 1, if RS1 ̸= 0
bnez RS1 imm PC = PC + 1, if RS1 = 0
PC = PC + 1 + sext(imm), if RS1 ̸= 0
jr RS1 PC = RS1
jalr RS1 R31 = PC+1; PC = RS1
• Miscellaneous–instructions:
Instruction Semantics
special-nop causes transition to Init/Fetch states
halt causes transition to HALT state
Encoding of the Instruction Set
Tables 7.1 and 7.2 specify the binary encoding of the instructions.
7.2 Implementation
7.2.1 Datapath
The datapath of the simplified DLX is depicted in Figure 7.1. The proposed datapath is not a
typical datapath; it lacks busses and drivers. There are two reasons for this: (a) the interface with
the I/O Control Logic does not requires busses; and (b) to protect the hardware we do no allow
7.2.IMPLEMENTATION 39Instruction SemanticsslliRDRS1RD:=RS1<<1srliRDRS1RD:=RS1>>1addRDRS1RS2RD:=RS1+RS2subRDRS1RS2RD:=RS1−RS2andRDRS1RS2RD:=RS1∧RS2orRDRS1RS2RD:=RS1∨RS2xorRDRS1RS2RD:=RS1⊕RS2•Test–Instructions:InstructionSemantics
sreliRDRS1immRD:=1,ifconditionissatisfied,
RD:=0otherwise
ifrel=lt testifRS1
ifrel=le testifRS1≤sext(imm)
ifrel=ge testifRS1≥sext(imm)
ifrel=ne testifRS1̸=sext(imm)
•Jump–instructions:
InstructionSemantics
beqzRS1immPC=PC+1+sext(imm),ifRS1=0
PC=PC+1,ifRS1̸=0
bnezRS1immPC=PC+1,ifRS1=0
PC=PC+1+sext(imm),ifRS1̸=0
jrRS1PC=RS1
jalrRS1R31=PC+1;PC=RS1
•Miscellaneous–instructions:
InstructionSemantics
special-nop causestransitiontoInit/Fetchstates
halt causestransitiontoHALTstate
EncodingoftheInstructionSet
Tables7.1and7.2specifythebinaryencodingoftheinstructions.
7.2Implementation
7.2.1Datapath
ThedatapathofthesimplifiedDLXisdepictedinFigure7.1.Theproposeddatapathisnota
typicaldatapath;itlacksbussesanddrivers.Therearetworeasonsforthis:(a)theinterfacewith
theI/OControlLogicdoesnotrequiresbusses;and(b)toprotectthehardwarewedonoallow