CHAPTER 6 – ASSEMBLY LANGUAGE PROGRAM FORMAT AND DATA DEFINITION
Chapter 10
Assembly Language Format
SSK 3207 – Chapter 10
Topics
10.1 Introduction
10.2 Assembly Language Program Format
10.3 Data Definition
10.4 DEBUG Program
10.1 Introduction
Levels of Programming Languages
Machine Language
Consists of individual instructions that the CPU executes one at a time.
Assembly Language (Low-Level Language)
Designed for a specific family of processors (each processor family has its own assembly language).
Consists of symbolic instructions that correspond one-to-one to machine language instructions and are assembled into machine language.
High-Level Languages
e.g. C, C++ and Visual Basic
Designed to eliminate the technicalities of a particular computer.
A single statement in a high-level language typically compiles into many low-level (machine) instructions.
Advantages of Assembly Language
Shows how a program interfaces with the processor, operating system, and BIOS.
Shows how data is represented and stored in memory and on external devices.
Clarifies how the processor accesses and executes instructions and how instructions access and process data.
Clarifies how a program accesses external devices.
Reasons for using Assembly Language
A program written in assembly language typically requires considerably less memory and execution time than one written in a high-level language.
Assembly language gives a programmer the ability to perform highly technical tasks that would be difficult, if not impossible, in a high-level language.
Although most software specialists develop new applications in high-level languages, which are easier to write and maintain, a common practice is to recode in assembly language those sections that are time-critical.
Resident programs (which reside in memory while other programs execute) and interrupt service routines (which handle input and output) are almost always developed in assembly language.
10.2 Assembly Language Program Format
An assembly language program can be written in either .COM or .EXE format.
.COM format:
consists of one segment that contains the code, data and stack.
useful for a small utility program or a resident program.
.EXE format:
consists of separate code, data and stack segments.
used for larger, more substantial programs.
The general structure of an assembly language program in .COM and .EXE format is shown in Diagram 1 and Diagram 2.
Observe that each segment (data, code and stack) of the .EXE program is defined separately, whereas a .COM program has no separate segments for data and stack.
In a .COM program, the ASSUME statement tells the assembler that the CS, DS, SS and ES registers all hold the same starting address, that of the code segment.
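As a rough illustration of the single-segment layout described above, the following is a minimal sketch of a .COM program skeleton written with conventional segment directives. The names CODESG, BEGIN and MAIN are hypothetical, and the way the object file is turned into a .COM file depends on the assembler and linker used.

```asm
; Sketch of a .COM skeleton (CODESG, BEGIN and MAIN are illustrative names)
CODESG  SEGMENT PARA 'Code'
        ASSUME  CS:CODESG, DS:CODESG, SS:CODESG, ES:CODESG
        ORG     100H            ; code starts at offset 100H, after the 256-byte PSP
BEGIN:  JMP     MAIN            ; jump over the data area
; data definitions would go here, inside the same segment
MAIN    PROC    NEAR
        ; program instructions would go here
        MOV     AX, 4C00H       ; function 4CH in AH, return code 0 in AL
        INT     21H             ; end processing and return to DOS
MAIN    ENDP
CODESG  ENDS
        END     BEGIN           ; entry point is the first instruction at 100H
```

All four segment registers are associated with the one segment, which is why no separate data or stack segment appears.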
Ending Program Execution
INT 21H is a commonly used DOS interrupt service. It uses the function code in the AH register to determine which action to perform.
INT 21H can also be used to control input from the keyboard, control the screen, perform disk I/O and send output to the printer.
INT 21H with function code 4CH is used to terminate program execution. The function code 4CH must be loaded into AH before the interrupt is issued.
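The termination sequence can be sketched as a complete, do-nothing program. This is only a minimal example using simplified segment directives; the procedure name MAIN and the return code 0 are arbitrary choices.

```asm
        .MODEL  SMALL
        .STACK  64
        .CODE
MAIN    PROC    FAR
        MOV     AH, 4CH         ; function code 4CH = terminate program
        MOV     AL, 0           ; return code handed back to DOS (0 here)
        INT     21H             ; request the DOS service
MAIN    ENDP
        END     MAIN
```

The two MOV instructions are often combined into a single MOV AX, 4C00H, which loads AH and AL at the same time.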
Examples of .EXE and .COM programs
Table 1 and Table 2 are examples of assembly language programs in .EXE and .COM format.
In the examples, the values 215 and 125 are defined in the data segment (as FLDD and FLDE), each one word in size (DW = Define Word = 16 bits or 2 bytes).
The result is kept in FLDF, which is also one word in size. The AX register holds the first operand and then the result.
Table 1: Example of .EXE program
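A program matching this description might look like the sketch below. It assumes that the operation performed on FLDD and FLDE is an addition, which the text does not state, and the segment names STACKSG, DATASG and CODESG, the title and the stack size are hypothetical.

```asm
        PAGE    60, 132
        TITLE   EXAMPLE1 (EXE) Add two words
; Sketch only: the ADD operation and all segment names are assumptions
;-------------------------------------------------
STACKSG SEGMENT PARA STACK 'Stack'
        DW      32 DUP(0)       ; 32 words reserved for the stack
STACKSG ENDS
;-------------------------------------------------
DATASG  SEGMENT PARA 'Data'
FLDD    DW      215             ; first operand
FLDE    DW      125             ; second operand
FLDF    DW      ?               ; result
DATASG  ENDS
;-------------------------------------------------
CODESG  SEGMENT PARA 'Code'
MAIN    PROC    FAR
        ASSUME  SS:STACKSG, DS:DATASG, CS:CODESG
        MOV     AX, DATASG      ; a segment address cannot be moved
        MOV     DS, AX          ;   into DS directly, so go through AX
        MOV     AX, FLDD        ; AX = 215
        ADD     AX, FLDE        ; AX = 215 + 125 = 340
        MOV     FLDF, AX        ; store the result in FLDF
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
CODESG  ENDS
        END     MAIN
```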
NOTE: Table 1 and Table 2 above show the conventional way of writing .EXE and .COM programs. Both formats can also be written more simply using simplified segment directives. Students are required to learn and experiment with them.
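For comparison, here is a minimal sketch of an .EXE program written with simplified segment directives. It uses the FLDD, FLDE and FLDF items described earlier and assumes the operation is an addition; the stack size of 64 bytes is an arbitrary choice.

```asm
        .MODEL  SMALL           ; one code segment, one data segment
        .STACK  64              ; define a 64-byte stack
        .DATA                   ; start of the data segment
FLDD    DW      215
FLDE    DW      125
FLDF    DW      ?
        .CODE                   ; start of the code segment
MAIN    PROC    FAR
        MOV     AX, @DATA       ; @DATA is the predefined data segment address
        MOV     DS, AX
        MOV     AX, FLDD        ; AX = 215
        ADD     AX, FLDE        ; AX = 340 (addition is an assumption)
        MOV     FLDF, AX
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```

The .STACK, .DATA and .CODE directives replace the explicit SEGMENT, ENDS and ASSUME statements of the conventional form.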
Program Comments
Symbol: ';' (semicolon).
All characters to the right of the semicolon are treated as a comment.
Below are some examples of comments:
On a line of its own
; Example of Comment
On the same line as an instruction
ADD AX, BX ; Adds the values of the AX and BX registers
Note: Comments are not translated into machine code, so the length of a comment does not affect the size of the machine code program.
Reserved Words
Certain names are reserved by the assembler for its own purposes and may be used only under special conditions.
Categories of reserved words include:
Instructions: such as MOV and ADD, which are operations that the computer can execute.
Directives: such as END and SEGMENT, which you use to provide information to the assembler.
Operators: such as FAR and SIZE, which are used in expressions.
Predefined Symbols: such as @Data and @Model, which return information to the program during assembly.
Identifiers
An identifier is a programmer-chosen name. It might identify a variable, a constant, a procedure, or a code label.
There are two types of identifiers: name and label.
Examples:
Name: refers to the address of a data item.
DATA1 DB 12
Label: refers to the address of an instruction, procedure or segment.
MAIN PROC FAR
B30: ADD BL, 25
Directives
Directives are statements that let the programmer control how the source program is assembled and listed. They are not translated into machine code.
e.g. defining logical segments, choosing a memory model, defining variables, creating procedures, and so on.
PAGE and TITLE are examples of directives.
Instructions
An instruction is a statement that the processor executes at runtime, after the program has been loaded into memory and started.
It has four basic parts:
Label (optional)
Instruction mnemonic (required)
Operand(s) (usually required)
Comment (optional)
Label: Mnemonic Operand(s) ;Comment
Label
An identifier that acts as a place marker for either instructions or data.
Instruction Mnemonic
A short word that identifies the operation carried out by an instruction.
e.g. ADD (add two values)
Operands
Assembly language instructions can have between zero and three operands, each of which can be a register, memory operand, constant expression, or I/O port.
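The parts of an instruction can be seen in a short program such as the sketch below. The variable COUNT and the label B30 are illustrative names, and only instructions with zero, one and two operands are shown.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
COUNT   DW      5               ; hypothetical memory operand
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA       ; two operands: register, constant
        MOV     DS, AX          ; two operands: register, register
        CLC                     ; no operands: clear the carry flag
        INC     COUNT           ; one operand: memory (COUNT becomes 6)
        MOV     AX, 2+3*4       ; constant expression, evaluated by the assembler (14)
B30:    ADD     AX, COUNT       ; label, mnemonic, two operands and a comment
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```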
10.3 Data Definition
The assembler offers several directives that let programmers define data according to its type and length. Format for a data definition:
[name] Dn expression
[name]
Data names are optional, because in assembly language programming data is not necessarily referenced by its name.
Dn directive
The next slide shows the common directives for defining data, including the directives used in MASM 6.0.
The following are some examples of numeric and character data definitions:
PAGE 60, 132
TITLE A04DEFIN (EXE) Define data directives
.MODEL SMALL
.DATA
; DB - Define Bytes:
; ------------------
BYTE1 DB ? ; Uninitialized
BYTE2 DB 48 ; Decimal constant
BYTE3 DB 30H ; Hex constant
BYTE4 DB 01111010B ; Binary constant
BYTE5 DB 10 DUP (0) ; Ten zeros
BYTE6 DB 'PC FAIR' ; Character string
BYTE7 DB '12345' ; Number as characters
BYTE8 DB 01, 'Jan', 02, 'Feb' ; Table of months
; DW - Define Words:
; ------------------
WORD1 DW 0FFF0H ; Hex constant
WORD2 DW 01111010B ; Binary constant
WORD3 DW BYTE8 ; Address constant
WORD4 DW 2, 4, 6, 7, 9 ; Table of 5 constants
WORD5 DW 6 DUP (0) ; Six zeros
; DD - Define Doublewords:
; ------------------------
DWORD1 DD ? ; Uninitialized
DWORD2 DD 41562 ; Decimal value
DWORD3 DD 24, 48 ; Two constants
DWORD4 DD BYTE3 - BYTE2 ; Difference between addresses
END
DB or BYTE
Defines an item one byte in size. Its range of values is given in Table 3.
DW or WORD
Defines an item one word (2 bytes) in size. Its range of values is given in Table 3.
The assembler converts numeric constants to binary object code and stores multi-byte values in memory in reverse byte order (low-order byte first). For instance, the value 3039H is stored as the bytes 39H 30H in the data segment.
DD or DWORD
Defines an item 4 bytes in size. Its range of values is given in Table 3.
As with DW, the data is stored in reverse byte order. For example, the value 00BC614EH is stored as the bytes 4EH 61H BCH 00H.
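The reverse byte order can be observed by reading the individual bytes of a word or doubleword. The following is a small sketch under that idea; the names WORDX and DWORDX are hypothetical.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
WORDX   DW      3039H           ; stored in memory as 39H 30H
DWORDX  DD      00BC614EH       ; stored in memory as 4EH 61H BCH 00H
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     AL, BYTE PTR WORDX      ; AL = 39H (low-order byte, lowest address)
        MOV     AH, BYTE PTR WORDX+1    ; AH = 30H (high-order byte)
        MOV     BL, BYTE PTR DWORDX     ; BL = 4EH (first byte in memory)
        MOV     BH, BYTE PTR DWORDX+3   ; BH = 00H (last byte in memory)
        MOV     AX, 4C00H               ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```

Loading the whole word with MOV AX, WORDX still gives 3039H, because the processor reverses the bytes again when it reads a word.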
Expressions
An expression in the operand may specify an uninitialized value or a constant value.
Example:
DATAX DB ? ; Uninitialized item, size of 1 byte
DATAY DB 25 ; Initialized item, DATAY with value 25
An uninitialized item reserves storage of a defined size in which a value can be stored later.
The value of a data item can be read and changed as the program requires.
An expression can contain several constants separated by commas (','); their number is limited only by the line length.
Example:
DATAZ DB 21, 22, 23, 24, 25, 26, …
The assembler defines the above constants byte by byte, from left to right.
DATAZ or DATAZ+0 contains the value 21, DATAZ+1 contains 22, DATAZ+2 contains 23, and so forth.
For example, the instruction
MOV AL, DATAZ+3
loads the value 24 into the AL register.
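A minimal sketch of accessing the table, assuming DATAZ holds exactly the six values listed. The use of BX as an index register is an addition for illustration.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
DATAZ   DB      21, 22, 23, 24, 25, 26
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     AL, DATAZ       ; AL = 21 (same as DATAZ+0)
        MOV     AL, DATAZ+3     ; AL = 24 (fourth byte of the table)
        MOV     BX, 5           ; index held in a register
        MOV     AL, DATAZ[BX]   ; AL = 26 (byte at DATAZ+5)
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```

The offset in DATAZ+3 is fixed at assembly time, whereas DATAZ[BX] lets the program choose the element at run time.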
Expressions also allow constants to be duplicated, using the format [name] Dn repeat-count DUP(expression).
Example:
DW 10 DUP(?) ; Ten words, uninitialized
DB 5 DUP(12) ; Five bytes containing hex 0C0C0C0C0C
DB 3 DUP(5 DUP(4)) ; Fifteen 4s
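The effect of DUP on the amount of storage reserved can be checked with the location counter ($). The sketch below gives names to the three definitions above; WORDS, TWELVES, FOURS and FOURLEN are hypothetical names.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
WORDS   DW      10 DUP(?)           ; 10 words = 20 bytes, uninitialized
TWELVES DB      5 DUP(12)           ; 5 bytes, each containing 0CH
FOURS   DB      3 DUP(5 DUP(4))     ; 3 x 5 = 15 bytes, each containing 04H
FOURLEN EQU     $ - FOURS           ; bytes defined since FOURS = 15
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     CX, FOURLEN         ; CX = 15
        MOV     AL, FOURS+14        ; AL = 04H (last byte of the nested DUP)
        MOV     AX, 4C00H           ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```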
Character Strings
Character strings are defined using either single quotes, like 'name', or double quotes, like "name".
The content in quotes is stored as object code in ASCII format.
Example:
DB "Crazy Sam's CD Emporium" ; double quotes for string, single quote for apostrophe
DB 'Crazy Sam''s CD Emporium' ; single quotes for string, two single quotes for apostrophe
Numeric Constants
A numeric constant is used to define a numeric value or a memory address. The following are some numeric formats:
Binary: uses the binary digits 0 and 1, followed by the radix specifier B. Example: 01001100B.
Decimal: uses the decimal digits 0 to 9, optionally followed by the radix specifier D. Example: 125D or 125.
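Hexadecimal: uses the hexadecimal digits 0 to 9 and A to F, followed by the radix specifier H. A hexadecimal constant must begin with a digit, so a value that starts with a letter is written with a leading 0. Example: 0FFF0H.
Real: a decimal or hexadecimal constant followed by the radix specifier R. The assembler converts the value into floating-point format.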
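The radix specifiers only change how a value is written in the source, not what is stored. As a small sketch (the names VALB, VALD, VALD2, VALH and VALHI are hypothetical), the first four items below all define the same byte:

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
VALB    DB      01001100B       ; binary
VALD    DB      76              ; decimal, no specifier (the default radix)
VALD2   DB      76D             ; decimal with the D specifier
VALH    DB      4CH             ; hexadecimal, same value as 76
VALHI   DB      0FFH            ; leading 0 needed because the value starts with F
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     AL, VALB        ; AL = 4CH
        CMP     AL, VALH        ; equal, so the zero flag is set
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```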
10.4 DEBUG Program
The DEBUG program is used for testing and debugging executable programs. It lets you:
view or display the contents of main memory (MM)
enter programs into memory (only programs in machine language or assembly language)
trace the execution of a program
DEBUG also provides a single-step mode, which allows you to execute a program one instruction at a time, so that you can view the effect of each instruction on memory locations and registers.
DEBUG Commands
The following are some DEBUG commands:
A : Assemble symbolic instructions into machine code
D : Display the contents of an area of memory in hex format
E : Enter data into memory, beginning at a specific location
G : Run the executable program in memory (G means "go")
H : Perform hexadecimal arithmetic
N : Name a program
P : Proceed, or execute a set of related instructions
Q : Quit the DEBUG session
R : Display the contents of one or more registers in hex format
T : Trace the execution of one instruction
U : Unassemble (or disassemble) machine code into symbolic code
Note: refer to Appendix C of the main reference (pp. 513-519) for the complete set of DEBUG commands.
Display the contents of main memory
To display the contents of segment FE00H (FE00 in base 16), beginning from the first byte of the segment, type the following at the DEBUG prompt:
D FE00:0
or
d fe00:0
Here d is the display command, fe00 is the segment, and 0 is the offset of the first byte of the segment.
Note: all commands can be written in lowercase or uppercase letters.
c:\>DEBUG
-d fe00:0
FE00:0000 41 77 61 72 64 20 53 6F-66 74 77 61 72 65 49 42 Award SoftwareIB
FE00:0010 4D 20 43 4F 4D 50 41 54-49 42 4C 45 20 34 38 36 M COMPATIBLE 486
FE00:0020 20 42 49 4F 53 20 43 4F-50 59 52 49 47 48 54 20  BIOS COPYRIGHT
FE00:0030 41 77 61 72 64 20 53 6F-66 74 77 61 72 65 20 49 Award Software I
FE00:0040 6E 63 2E 6F 66 74 77 61-72 65 20 49 6E 63 2E 20 nc.oftware Inc.
FE00:0050 41 77 03 0C 04 01 01 6F-66 74 77 E9 11 14 20 43 Aw.....oftw... C
FE00:0060 18 41 77 61 72 64 20 4D-6F 64 75 6C 61 72 20 42 .Award Modular B
FE00:0070 49 4F 53 20 76 36 2E 30-00 A6 32 EC 33 EC 35 EC IOS v6.0..2.3.5.
Each row has three parts: the address, the hexadecimal representation of the bytes, and their ASCII characters.
The d or D (display) command shows 8 rows of data, each containing 16 bytes (32 hex digits), for a total of 128 bytes beginning at the address given.
Enter the program in machine language
Using Immediate and Register Addressing Mode
The DEBUG program can also be used to enter a machine language program into memory and trace its execution.
The following is an example of a program in machine language (in hexadecimal) and assembly language (in symbolic code), together with a description of each instruction.
Machine Instruction    Assembly Language Instruction    Explanation
B82301                 MOV AX,0123                      Move value 0123H to AX
052500                 ADD AX,0025                      Add value 0025H to AX
8BD8                   MOV BX,AX                        Move contents of AX to BX
03D8                   ADD BX,AX                        Add contents of AX to BX
8BCB                   MOV CX,BX                        Move contents of BX to CX
2BC8                   SUB CX,AX                        Subtract contents of AX from CX
2BC0                   SUB AX,AX                        Subtract AX from AX
EBEE                   JMP 100                          Go back to the start
The first and second instructions in the above program use immediate addressing mode (the actual data value is part of the instruction) and the others use register addressing mode.
To enter the machine language instructions into memory (the code segment), use the "e" or "E" command (Table 1).
Use "r" or "R" to view the contents of the CPU registers and "t" or "T" to trace the execution of the program.
-e CS:100 B8 23 01 05 25 00
-e CS:106 8B D8 03 D8 8B CB
-e CS:10C 2B C8 2B C0 EB EE
Table 1
-r
AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=0100 NV UP EI PL NZ NA PO NC
2090:0100 B82301 MOV AX,0123
-t
AX=0123 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=0103 NV UP EI PL NZ NA PO NC
2090:0103 052500 ADD AX,0025
-t
AX=0148 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=0106 NV UP EI PL NZ NA PE NC
2090:0106 8BD8 MOV BX,AX
-t
AX=0148 BX=0148 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=0108 NV UP EI PL NZ NA PE NC
2090:0108 03D8 ADD BX,AX
-t
AX=0148 BX=0290 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=010A NV UP EI PL NZ AC PE NC
2090:010A 8BCB MOV CX,BX
-t
AX=0148 BX=0290 CX=0290 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=010C NV UP EI PL NZ AC PE NC
2090:010C 2BC8 SUB CX,AX
-t
AX=0148 BX=0290 CX=0148 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=010E NV UP EI PL NZ AC PE NC
2090:010E 2BC0 SUB AX,AX
-t
AX=0000 BX=0290 CX=0148 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
DS=2090 ES=2090 SS=2090 CS=2090 IP=0110 NV UP EI PL ZR NA PE NC
2090:0110 EBEE JMP 0100
To view the instructions that were entered in the code segment, use the "d" or "D" command followed by cs:100 (the first instruction starts at offset 100H in the code segment).
-d cs:100
2090:0100 B8 23 01 05 25 00 8B D8-03 D8 8B CB 2B C8 2B C0 .#..%.......+.+.
2090:0110 EB EE E8 59 00 5F 5E 59-5B 58 5A 1F 34 00 7F 20 ...Y._^Y[XZ.4..
2090:0120 D5 E2 00 74 F7 1E 0E 1F-BE D5 E2 E8 98 02 2E A1 ...t............
2090:0130 2F E7 BB 40 00 BA 01 00-33 FF CD 21 1F 72 0B 8B /..@....3..!.r..
2090:0140 D8 B0 FF 86 47 18 A2 18-00 C3 0E 1F E8 D2 00 3D ....G..........=
2090:0150 41 00 74 07 0B FF 74 06-BA 39 82 E9 CD FC 2E C6 A.t...t..9......
2090:0160 06 67 E1 01 BA 69 E1 2E-A3 69 E1 E9 BD FC 80 3E .g...i...i.....>
2090:0170 E7 04 00 75 03 E9 9A 00-BE E7 04 E8 48 02 80 3E ...u........H..>
Table 3
Using Direct Addressing Mode
In this case, the data must be entered into the data segment. Assume the data is positioned in the data segment as shown in Table 4 and the instructions are as shown in Table 5.
Table 4
DS Offset    Contents (Hex)
0200H        2301H
0202H        2500H
0204H        0000H
0206H        2A2A2AH
Table 5
Instruction    Description
A10002         Move the word (two bytes) beginning at DS offset 0200H into AX
03060202       Add the contents of the word beginning at DS offset 0202H to AX
A30402         Move the contents of AX to the word beginning at DS offset 0204H
EBF4           Jump to start of program
Enter the instructions (Table 5) and data (Table 4) above using the "E" or "e" command, as in Table 6 below.
-E CS:100 A1 00 02 03 06 02 02
-E CS:107 A3 04 02 EB F4
-E DS:200 23 01 25 00 00 00
-E DS:206 2A 2A 2A
Table 6
The first two rows are the instructions, which start at offset 100H in the code segment (CS), whereas the last two rows are the data, which start at offset 200H in the data segment (DS).
Enter the program in assembly language
An assembly language program can be written or entered into memory using the "A" or "a" command in the DEBUG program.
Example:
MOV CL, 42 (load the value 42H into the CL register)
MOV DL, 2A (load the value 2AH into the DL register)
ADD CL, DL (add the value in the DL register to the value in the CL register and store the result in the CL register)
To enter the above program into memory:
Start the DEBUG program.
Type 'a 100' (because instructions may only start at offset 100H or above in the code segment).
The address of the code segment and the offset address '100' will appear on the screen.
Type the instructions as in Table 7 below.
-A 100
2090:0100 MOV CL, 42
2090:0102 MOV DL, 2A
2090:0104 ADD CL, DL
2090:0106 JMP 100
2090:0108
Table 7
To view the machine code for the assembly language entered, use the "u" or "U" command (U = unassemble).
-U 100,107
2090:0100 B142 MOV CL,42
2090:0102 B22A MOV DL,2A
2090:0104 00D1 ADD CL,DL
2090:0106 EBF8 JMP 0100
The second column is the machine code.
To execute the above program, use the "r" or "R" command followed by the "t" or "T" command, as before.
Using the INT instruction
The DEBUG program can also be used to request information about the system by using the INT (interrupt) instruction.
The INT instruction exits from the program, enters a DOS or BIOS routine, performs the requested function, and returns to the program.
Example 1: Getting the Current Date
The instruction to access the current date is INT 21H with function code 2AH. The function code 2AH must be moved to the AH register. The instructions are as follows:
MOV AH, 2A
INT 21
JMP 100
Note: use the A command to enter the above instructions into the code segment.
Type R to display the registers and T to execute the MOV.
Type P to proceed directly through the interrupt routine; the operation stops at the JMP.
The registers contain the following information in hex format:
AL: Day of the week, where 0 = Sunday
CX: Year (for example, 07D4H = 2004)
DH: Month (01H through 0CH)
DL: Day of the month (01H through 1FH)
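The same request can be made from an assembled program instead of DEBUG. The following is a minimal sketch that saves the returned values in memory; the variable names YEAR, MONTH, DAY and WEEKDAY are hypothetical. Note that in a source file hexadecimal constants need the H specifier, whereas DEBUG treats every number as hexadecimal.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
YEAR    DW      ?               ; hypothetical variables for the results
MONTH   DB      ?
DAY     DB      ?
WEEKDAY DB      ?
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     AH, 2AH         ; function code 2AH = get current date
        INT     21H             ; DOS returns the date in AL, CX, DH and DL
        MOV     WEEKDAY, AL     ; day of the week, 0 = Sunday
        MOV     YEAR, CX        ; year
        MOV     MONTH, DH       ; month, 1 through 12
        MOV     DAY, DL         ; day of the month, 1 through 31
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```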
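Example 2: Displaying
To display data on the screen, enter the following instructions using the A 100 command.
100 MOV AH, 09
102 MOV DX, 109
105 INT 21
107 JMP 100
109 DB 'MY NAME IS YANI', '$'
The value 109 loaded into DX is the starting address of the data to display. Function 09H displays the characters from that address up to, but not including, the '$' terminator.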
Key in R to display the registers and the first instruction, and key in T commands for the two MOVs. Key in P to execute INT 21, and MY NAME IS YANI will be displayed on the screen.
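Written as an assembled program rather than typed into DEBUG, the display request might look like the sketch below. The name MESSAGE is hypothetical, and LEA is used to obtain the offset of the string instead of a hand-calculated address.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
MESSAGE DB      'MY NAME IS YANI', '$'  ; '$' marks the end of the string
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX          ; function 09H expects the string at DS:DX
        MOV     AH, 09H         ; function code 09H = display string
        LEA     DX, MESSAGE     ; DX = offset of the data to display
        INT     21H
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```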
Example 3: Keyboard Input
To accept characters from the keyboard, type in the DEBUG command A and then these assembly instructions:
100 MOV AH, 10
102 INT 16
104 JMP 100
The first instruction, MOV, provides function code 10H, which tells INT 16H to accept data from the keyboard.
The operation delivers the character from the keyboard to the AL register.
Key in R to display the registers and the first instruction, and key in a T command to execute the MOV.
Type P for INT 16H; the system waits for you to press a key.
If you press the number 1, the operation delivers 31H (hex for ASCII 1) to AL.
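As an assembled program, the keyboard request could be sketched as follows. The variable KEYIN is hypothetical, and the echo of the character through INT 21H function 02H is an addition for illustration, not part of the example above.

```asm
        .MODEL  SMALL
        .STACK  64
        .DATA
KEYIN   DB      ?               ; hypothetical variable for the character read
        .CODE
MAIN    PROC    FAR
        MOV     AX, @DATA
        MOV     DS, AX
        MOV     AH, 10H         ; function code 10H = read a key
        INT     16H             ; BIOS waits for a keypress, character in AL
        MOV     KEYIN, AL       ; e.g. 31H if the key pressed was 1
        MOV     DL, AL          ; echo the character on the screen:
        MOV     AH, 02H         ;   function 02H displays the character in DL
        INT     21H
        MOV     AX, 4C00H       ; end processing
        INT     21H
MAIN    ENDP
        END     MAIN
```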