OSU CSE 2421
CSE 2421
J.E.Jones
Recall from last lecture …
# multiplier in %ecx, multiplicand in %eax
# if %ecx = 0x00001000 and %eax = 0x75747372

Before:  %rdx = some value            %rax = 0x0000000075747372

    imull %ecx        # OR mull %ecx    (Note: 1 operand!)

After:   %rdx = 0x0000000000000757    %rax = 0x0000000047372000

# 8-byte result is 0x00000757 47372000
Note: the value in %rdx is changed!
Recall from Lab 5 that when a 4-byte value is written to a 4-byte register, the upper 4 bytes of the enclosing 8-byte register are set to zero.
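The split of the product across the two registers can be modeled in C. This is only a sketch: the struct and function names are hypothetical, and it models the unsigned (mull) case.

```c
#include <stdint.h>

/* Hypothetical model of what one-operand mull leaves in %edx:%eax. */
typedef struct {
    uint32_t edx; /* high 4 bytes of the product */
    uint32_t eax; /* low 4 bytes of the product  */
} reg_pair32;

reg_pair32 model_mull(uint32_t eax, uint32_t src)
{
    /* Widen first so the full 8-byte product is kept. */
    uint64_t product = (uint64_t)eax * (uint64_t)src;
    reg_pair32 out;
    out.edx = (uint32_t)(product >> 32);
    out.eax = (uint32_t)product;
    return out;
}
```

Calling model_mull(0x75747372, 0x00001000) gives the two halves used in the example above.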
%eax: 0x47372000
%edx: 0x00000757
Want 8-byte result: 0x0000075747372000

    shlq $32, %rdx     # %rdx = 0x0000075700000000, %rax = 0x0000000047372000
    orq  %rdx, %rax    # OR: addq %rdx, %rax

      0x0000075700000000
    | 0x0000000047372000
    = 0x0000075747372000
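The shift-and-combine step can be checked in C. A minimal sketch, assuming the upper 4 bytes of both inputs are already zero (as they are after a 4-byte multiply); the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of: shlq $32, %rdx ; orq %rdx, %rax */
uint64_t combine_halves64(uint64_t rdx, uint64_t rax)
{
    rdx <<= 32;        /* move the high half into the upper 4 bytes */
    return rax | rdx;  /* no bits overlap, so OR and ADD give the same result */
}
```

Because the two values share no set bit positions, replacing the OR with an addition yields the same 8-byte value.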
If we have two 2-byte values that we multiply together, using imulw %cx, giving us a 4-byte result where the least significant 2 bytes are in %ax and the most significant 2 bytes in %dx, then:
◦ Garbage can be in the 6 upper bytes in both the %rax and %rdx registers
◦ We must combine the 2-byte values in %ax and %dx so that we have a 4-byte value we can use
# multiplier in %cx, multiplicand in %ax
# if %cx = 0x0100 and %ax = 0x7574

Before:  %rdx = some value            %rax = 0x????????????7574

    imulw %cx         # OR mulw %cx    (Note: 1 operand!)

After:   %rdx = 0x????????????0075    %rax = 0x????????????7400

# 4-byte result is 0x0075 7400
Note: in %rdx, only the low 2 bytes (%dx) are changed!
Recall from Lab 5 that when a 2-byte value is written to a 2-byte register, the upper 6 bytes of the enclosing 8-byte register do not change.
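The two register-write rules recalled from Lab 5 can be sketched in C. The function names are hypothetical, and the 8-byte register is modeled as a uint64_t.

```c
#include <stdint.h>

/* Writing a 4-byte register (e.g. %eax) zeroes the upper 4 bytes. */
uint64_t write_low32(uint64_t reg, uint32_t value)
{
    (void)reg;               /* old contents are discarded entirely */
    return (uint64_t)value;
}

/* Writing a 2-byte register (e.g. %ax) leaves the upper 6 bytes alone. */
uint64_t write_low16(uint64_t reg, uint16_t value)
{
    return (reg & ~0xffffULL) | value;
}
```

This is why the 2-byte case needs explicit cleanup (movzwl or andl) before the halves are combined, while the 4-byte case does not.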
%ax: 0x1234, assume %eax = 0x784f1234
%dx: 0x0056, assume %edx = 0xab330056
Want 4-byte result: 0x00561234

    shll   $16, %edx     # %edx = 0x00560000
    movzwl %ax, %eax     # OR: andl $0x0000ffff, %eax    (%eax = 0x00001234)
    orl    %edx, %eax    # OR: addl %edx, %eax

      0x00560000
    | 0x00001234
    = 0x00561234
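The same sequence in C, as a sketch with a hypothetical function name. The inputs deliberately carry garbage in their upper 2 bytes, as in the example.

```c
#include <stdint.h>

/* Hypothetical model of: shll $16, %edx ; andl $0x0000ffff, %eax ; orl %edx, %eax */
uint32_t combine_halves32(uint32_t edx, uint32_t eax)
{
    edx <<= 16;           /* garbage in the upper 2 bytes is shifted out */
    eax &= 0x0000ffffu;   /* same effect as movzwl %ax, %eax */
    return eax | edx;
}
```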
OK! I get that if I’m multiplying 8-byte values together, using mulX (or imulX with one operand) is necessary because I get a 16-byte result, but isn’t there an easier way for smaller values?
So, you want an easier way to multiply w-bit values (w ≤ 32) and get the whole 2w-bit result in one register?
Have I got a deal for you!
Recall that using mulX (for unsigned values) doesn’t allow for use of more than one operand and always (except for 1-byte operands, where the 2-byte product fits in %ax) results in our 2w-bit product being split between 2 registers.
If we are using imulX (for signed values), we have two other options:
◦ imulX Src, Dest
◦ imulX Imm, Src, Dest
◦ But these two options produce w-bit results, so overflow is possible.
Consider that using imulX (signed multiply) on two positive values works pretty much the same way as mulX (unsigned multiply), except when the numbers are large, such as when the MSB is set to 1 and the value is meant to be interpreted as unsigned (B2U).
Alternatively, what about using imulY with 2 operands when mulX would normally be used, where Y is a one-increment promotion of the suffix X (e.g., w to l)?
◦ imull %edx, %eax rather than mulw %dx
imull %edx, %eax: full 4-byte result is in %eax
mulw %dx: 4-byte result is half in %dx and half in %ax
An alternative if we have 4-byte unsigned values:
%eax contains multiplicand, %edx contains multiplier
Top 4 bytes of each 8-byte register is already 0
We could use mull %edx and have our 8-byte product split between %eax and %edx, OR
Use imulY with two 8-byte operands:
◦ imulq %rdx, %rax
Now have 8-byte result in %rax, no contamination of %rdx register and knowledge that there is no overflow. Why do we know there is no overflow?
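One way to see why no bits are lost: the largest possible product of two 4-byte unsigned values is (2^32 - 1)^2 = 2^64 - 2^33 + 1, which still fits in 8 bytes. The sketch below uses unsigned C arithmetic to model the low 8 bytes that imulq leaves in %rax (those bits are the same for signed and unsigned multiplication); the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: both operands are zero-extended to 8 bytes,
 * then multiplied with an 8-byte result. */
uint64_t max_check_product(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

Note that this largest product has its top bit set, so it only reads correctly when the 8-byte result is interpreted as unsigned.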
# multiplier in %ecx, multiplicand in %eax
# if %ecx = 0x00001000 and %eax = 0x75747372

Before:  %rdx = some value            %rax = 0x0000000075747372

    mull %ecx         # OR imull %ecx    (Note: 1 operand!)

After:   %rdx = 0x0000000000000757    %rax = 0x0000000047372000

# 8-byte result is 0x00000757 47372000
Note: the value in %rdx is changed!
Recall from Lab 5 that when a 4-byte value is written to a 4-byte register, the upper 4 bytes of the enclosing 8-byte register are set to zero.
Let’s take advantage of this!!
# multiplier in %ecx, multiplicand in %eax
# if %ecx = 0x00001000 and %eax = 0x75747372

Before:  %rdx = some value    %rax = 0x0000000075747372

    imull %ecx, %eax          (Note: 2 operands!)

After:   %rdx = some value    %rax = 0x0000000047372000

# 8-byte result is 0x0000000047372000
Unfortunately, this isn’t quite what we wanted...
Note: the value in %rdx is not affected.
Note: the value in %rax is truncated.
Unsigned multiply (mulX) in this form is not a legal instruction.
# multiplier in %ecx, multiplicand in %eax
# if %ecx = 0x00001000 and %eax = 0x75747372
# take advantage of upper 4 bytes being set to 0

Before:  %rdx = some value    %rax = 0x0000000075747372

    imulq %rcx, %rax

After:   %rdx = some value    %rax = 0x0000075747372000

# 8-byte result is 0x0000075747372000
This works!!
Note: the value in %rdx is not affected.
Note: the value in %rax is valid.
Unsigned multiply (mulX) in this form is not a legal instruction.
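The difference between the 4-byte and 8-byte two-operand forms can be modeled side by side in C. This is a sketch with hypothetical function names, not a statement of how the hardware is implemented.

```c
#include <stdint.h>

/* Models imull %ecx, %eax: only the low 4 bytes of the product survive. */
uint32_t product_low32(uint32_t a, uint32_t b)
{
    return a * b;  /* unsigned arithmetic wraps modulo 2^32 */
}

/* Models imulq %rcx, %rax on zero-extended operands: all 8 bytes survive. */
uint64_t product_full64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

With the operands from the example, the first function returns the truncated value and the second returns the full product.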
An alternative if we have 2-byte unsigned values:
%ax contains multiplicand, %dx contains multiplier
Zero extend both registers from 2 bytes to 4 bytes:
◦ movzwl %ax, %eax
◦ movzwl %dx, %edx
Use imulX with two 4-byte operands:
◦ imull %edx, %eax
Now have 4-byte result in %eax, no contamination of %edx register, and knowledge that there is no overflow.
# multiplier in %cx, multiplicand in %ax
# if %cx = 0x0100 and %ax = 0x7574
# assume values are unsigned (movzwl zero extends)

Before:  %rdx = some value    %rax = 0x????????????7574

    movzwl %cx, %ecx
    movzwl %ax, %eax

After:   %rdx = some value    %rax = 0x0000000000007574
# multiplier in %cx, multiplicand in %ax
# if %cx = 0x0100 and %ax = 0x7574

    movzwl %cx, %ecx      # create equiv values in more
    movzwl %ax, %eax      # bytes

Before:  %rdx = some value    %rax = 0x0000000000007574

    imull %ecx, %eax

After:   %rdx = some value    %rax = 0x0000000000757400

# 4-byte result is 0x00757400 in %eax
Note: the value in %rdx is not affected.
Note: the value in %eax is correct.
Let’s use an example with the largest 2-byte unsigned values we can:
%ax = 0xffff, then zero extend for unsigned to %eax = 0x0000ffff using movzwl
%dx = 0xffff, then zero extend for unsigned to %edx = 0x0000ffff using movzwl
Use imull %edx, %eax (%eax = 0xfffe0001) instead of mulw %dx (which results in %ax = 0x0001 and %dx = 0xfffe)
0xffff * 0xffff = 0xfffe0001 -> mulw %dx (split across %dx and %ax)
0x0000ffff * 0x0000ffff = 0xfffe0001 -> imull %edx, %eax
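The largest 2-byte unsigned product can be verified with a short C sketch (hypothetical function name). The casts matter: without them, C would promote both uint16_t operands to signed int, and 0xffff * 0xffff would overflow it.

```c
#include <stdint.h>

/* Models movzwl on both operands followed by imull %edx, %eax. */
uint32_t mul16_widened(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;  /* widen before multiplying */
}
```

The upper and lower 2 bytes of the returned value are what mulw %dx would leave in %dx and %ax respectively.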
An alternative if we have 2-byte signed values:
%ax contains multiplicand, %dx contains multiplier
Sign extend both registers from 2 bytes to 4 bytes:
◦ movswl %ax, %eax
◦ movswl %dx, %edx
Use imulX with two 4-byte operands:
◦ imull %edx, %eax
Now have 4-byte result in %eax, no contamination of %edx register, and knowledge that there is no overflow.
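For the signed case, the extreme products also fit in 4 bytes: the largest magnitude is (-32768) * (-32768) = 2^30. A minimal C sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Models movswl on both operands followed by imull %edx, %eax. */
int32_t mul16_signed_widened(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;  /* sign extend, then multiply */
}
```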
You, as the programmer, must choose what is appropriate to most easily compute the multiplication of 2 values.
You must choose whether a w-bit value is reasonable given the potential for overflow or whether a 2w-bit value should be used.
Many different options have been presented here for your use.
Choose wisely.
◦ https://www.bing.com/videos/search?q=raiders+of+the+lost+ark+%22choose+wisely%22&&view=detail&mid=BC9BA556D3545D3C477FBC9BA556D3545D3C477F&rvsmid=74E636E14A3596BAB92274E636E14A3596BAB922&FORM=VDQVAP
Like multiplication, x86-64 has a way for us to use a 16-byte (128-bit) value as a dividend.
◦ Must use an 8-byte (64-bit) divisor
◦ Quotient must fit in an 8-byte (64-bit) register
◦ Remainder must fit in an 8-byte (64-bit) register
Other sized dividend/divisor/quotient/remainder are available as well. These are analogous to multiplication examples.
ALU uses two 8-byte registers (the %rdx/%rax register pair) to accomplish this:

    %rdx               %rax
    Dividend (high)    Dividend (low)

This means the dividend *MUST* be in the %rdx/%rax pair.
If the dividend is only in %rax, must set %rdx to all 1’s or all 0’s (sign or zero extend).
Either the divq (unsigned) or the idivq (signed) divide instruction is used.
After the divide:

    %rdx         %rax
    remainder    quotient

Example:
void remdiv (long x, long y, long *qp, long *rp) {
    long q = x/y;
    long r = x%y;
    *qp = q;
    *rp = r;
}
void remdiv(long x, long y, long *qp, long *rp);
x in %rdi, y in %rsi, qp in %rdx, rp in %rcx

remdiv:
    movq  %rdx, %r8       # Copy qp, because %rdx has another purpose
    movq  %rdi, %rax      # Move x to lower 8 bytes of dividend
    cqto                  # Sign-extend to upper 8 bytes of dividend
    idivq %rsi            # Divide by y
    movq  %rax, (%r8)     # Store quotient at qp
    movq  %rdx, (%rcx)    # Store remainder at rp
    ret

Register contents:
After the two movq:  %rdx = qp (some address, 3rd parameter)    %rax = x (dividend)
After cqto:          %rdx = all 1’s or 0’s (dividend high)      %rax = x (dividend low)
After idivq:         %rdx = x%y (remainder)                     %rax = x/y (quotient)

Unsigned division makes use of the divq instruction rather than idivq. Typically, register %rdx is set to zero beforehand, because sign extending an unsigned value (using cqto) can cause errors.
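The kind of error that sign extending an unsigned dividend can cause is easier to see at a smaller size. The sketch below models the 4-byte divide (divl), so the %edx:%eax dividend fits in a uint64_t; all function names are hypothetical, and the same reasoning is assumed to carry over to divq with %rdx:%rax.

```c
#include <stdint.h>
#include <stdbool.h>

/* Dividend as divl sees it when %edx is zeroed first. */
uint64_t dividend_zero_extended(uint32_t eax)
{
    return (uint64_t)eax;
}

/* Dividend as divl would see it if %edx were filled by sign extension. */
uint64_t dividend_sign_extended(uint32_t eax)
{
    uint32_t edx = (eax & 0x80000000u) ? 0xffffffffu : 0u;
    return ((uint64_t)edx << 32) | eax;
}

/* The divide faults when the divisor is 0 or the quotient needs more than 4 bytes. */
bool divl_quotient_fits(uint64_t dividend, uint32_t divisor)
{
    return divisor != 0 && dividend / divisor <= 0xffffffffu;
}
```

For an unsigned value with its MSB set, such as 0x80000000, zero extension gives the intended dividend, while sign extension produces a huge dividend whose quotient no longer fits.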
void remdiv(long x, long y, long *qp, long *rp);
x (-5) in %rdi, y (2) in %rsi, qp in %rdx, rp in %rcx

remdiv:
    movq  %rdx, %r8       # Copy qp, because %rdx has another purpose
    movq  %rdi, %rax      # Move x to lower 8 bytes of dividend
    cqto                  # Sign-extend to upper 8 bytes of dividend
    idivq %rsi            # Divide by y
    movq  %rax, (%r8)     # Store quotient at qp
    movq  %rdx, (%rcx)    # Store remainder at rp
    ret

Register contents:
After the two movq:  %rdx = qp (some address, 3rd parameter)    %rax = 0xfffffffffffffffb (-5)
After cqto:          %rdx = 0xffffffffffffffff                  %rax = 0xfffffffffffffffb (-5)
After idivq:         %rdx (x%y) = 0xffffffffffffffff (-1)       %rax (x/y) = 0xfffffffffffffffe (-2)

Memory after the stores (lowest address first):
Address in %rdx/%r8 (quotient):   0xfe 0xff 0xff 0xff 0xff 0xff 0xff 0xff
Address in %rcx (remainder):      0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
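The -5 / 2 walkthrough can be checked against the C source. The function body below repeats the remdiv example from these slides; C's / and % truncate toward zero, so the remainder takes the sign of the dividend.

```c
/* Same body as the remdiv example in these slides. */
void remdiv(long x, long y, long *qp, long *rp)
{
    long q = x / y;   /* truncates toward zero: -5 / 2 is -2 */
    long r = x % y;   /* remainder has the sign of x: -5 % 2 is -1 */
    *qp = q;
    *rp = r;
}
```

In every case the results satisfy q * y + r == x, which is a handy sanity check when reading register dumps.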
void remdiv(long x, long y, long *qp, long *rp);
x (5) in %rdi, y (2) in %rsi, qp in %rdx, rp in %rcx

remdiv:
    movq  %rdx, %r8       # Copy qp
    movq  %rdi, %rax      # Move x to lower 8 bytes of dividend
    cqto                  # Sign-extend to upper 8 bytes of dividend
    idivq %rsi            # Divide by y
    movq  %rax, (%r8)     # Store quotient at qp
    movq  %rdx, (%rcx)    # Store remainder at rp
    ret

Register contents:
After the two movq:  %rdx = qp (some address, 3rd parameter)    %rax = 0x0000000000000005 (x)
After cqto:          %rdx = 0x0000000000000000                  %rax = 0x0000000000000005
After idivq:         %rdx (x%y) = 0x0000000000000001            %rax (x/y) = 0x0000000000000002

Memory after the stores (lowest address first):
Address in %rdx/%r8 (quotient):   0x02 0x00 0x00 0x00 0x00 0x00 0x00 0x00
Address in %rcx (remainder):      0x01 0x00 0x00 0x00 0x00 0x00 0x00 0x00
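The byte-by-byte memory pictures follow from x86-64 being little-endian: the least significant byte is stored at the lowest address. A small C sketch (hypothetical function name, assuming a little-endian machine such as x86-64) copies a value's bytes out in address order:

```c
#include <stdint.h>
#include <string.h>

/* Copies the 8 bytes of value into out[], lowest address first. */
void bytes_in_memory(int64_t value, unsigned char out[8])
{
    memcpy(out, &value, sizeof value);
}
```

On a little-endian machine, a quotient of 2 appears as 0x02 followed by seven 0x00 bytes, and a quotient of -2 appears as 0xfe followed by seven 0xff bytes.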
void remdiv(int x, int y, int *qp, int *rp);
x (5) in %edi, y (2) in %esi, qp in %rdx, rp in %rcx

remdiv:
    movq  %rdx, %r8       # Copy qp
    movl  %edi, %eax      # Move x to lower 4 bytes of dividend
    cltd                  # Sign-extend %eax into %edx (upper 4 bytes of dividend)
    idivl %esi            # Divide %edx:%eax by y
    movl  %eax, (%r8)     # Store quotient at qp
    movl  %edx, (%rcx)    # Store remainder at rp
    ret

idivl takes its 8-byte dividend from the %edx/%eax pair, so %edx must hold the sign extension of %eax before the divide.

Register contents:
After the two moves:  %rdx = qp (some address, 3rd parameter)    %eax = 0x00000005 (x)
After cltd:           %edx = 0x00000000                          %eax = 0x00000005
After idivl:          %edx (x%y) = 0x00000001                    %eax (x/y) = 0x00000002

Memory after the stores (lowest address first):
Address in %r8 (Quotient):     0x02 0x00 0x00 0x00
Address in %rcx (Remainder):   0x01 0x00 0x00 0x00
When in doubt, write a test program.
The 15 minutes it takes to write it and evaluate the results will save you days of debugging.
Operand size            Where is the    Where is the               Where is    Where is     idiv quotient       maximum div
(divisor / dividend)    Dividend?       Divisor?                   Quotient?   Remainder?   range               quotient
1 byte / 2 bytes        %ax             a 1-byte reg or memory     %al         %ah          -128 to +127        255
2 bytes / 4 bytes       %dx:%ax         a 2-byte reg or memory     %ax         %dx          -32768 to +32767    65535
4 bytes / 8 bytes       %edx:%eax       a 4-byte reg or memory     %eax        %edx         -2^31 to 2^31-1     2^32-1
8 bytes / 16 bytes      %rdx:%rax       an 8-byte reg or memory    %rax        %rdx         -2^63 to 2^63-1     2^64-1

In each register pair, the first register holds the high bytes of the dividend and the second holds the low bytes.
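The quotient limits in the table can be turned into a pre-check. The sketch below covers only the 1-byte row (divb and idivb, dividend in %ax); the function names are hypothetical, and the other rows are assumed to follow the same pattern with wider types.

```c
#include <stdint.h>
#include <stdbool.h>

/* divb: unsigned quotient must fit in %al, so it can be at most 255. */
bool divb_quotient_fits(uint16_t ax, uint8_t divisor)
{
    return divisor != 0 && ax / divisor <= 255;
}

/* idivb: signed quotient must fit in %al, so it must be in -128..+127. */
bool idivb_quotient_fits(int16_t ax, int8_t divisor)
{
    if (divisor == 0)
        return false;
    int q = ax / divisor;  /* operands are promoted to int, so this cannot overflow */
    return q >= -128 && q <= 127;
}
```

When such a check fails, the real instruction raises a divide error rather than truncating the quotient.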
void uremdiv(unsigned long x, unsigned long y,
             unsigned long *qp, unsigned long *rp) {
    unsigned long q = x/y;
    unsigned long r = x%y;
    *qp = q;
    *rp = r;
}
Modify the assembly code shown on previous slide for signed division to implement this function (unsigned division).
Submit code to Carmen.