CS代考 compiler assembly algorithm CSCI 2021: Binary, Integers, Arithmetic

CSCI 2021: Binary, Integers, Arithmetic
Chris Updated:
Mon Sep 27 02:27:48 PM CDT 2021
1

Logisitcs Reading
▶ C references
▶ Bryant/O’Hallaron Ch 2.1-3
Goals
▶ Binary Representations / Notation
▶ Integers in binary
▶ Arithmetic operations
Assignments: Questions?
▶ P1 Ongoing
▶ Lab03 on File Input ▶ HW03 on Binary Ints
Date Event
Fri 09/24 Mon 09/27 Wed 09/29
Fri 10/01 Mon 10/04
Solutions
Binary Ints/Chars Binary Ints/Chars Lec Practice Exam Lab Review
Exam 1 Project 1 Due
Lab 1/2 + HW 1/2 Solutions Posted to course schedule
2

Announcements: Course Feedback Survey on Canvas
▶ Will post a Feedback Survey tomorrow on Canvas, due following week
▶ Anonymous, worth 1 Engagement Point in place of Lab04
▶ Soliciting feedback on various aspects of the course, both
numerically and free-form
▶ If you have had good/bad/ugly experiences so far, this will be your chance to weigh in
3

Exam 1 Logistics
▶ In-person in class on Fri 10/01
▶ Exam starts at 2:30pm, ends at 3:20pm
▶ Expect 3 sides of paper (front, back, front)
Open Resource Exam
4

Open Resource Exam Rules
▶ Sign the sheet on turning in your exam to show attendence ▶ Silence your devices and keep screens visible to instructor ▶ Protect your work from theft
▶ You may be asked to show your Student ID
Can Use, physical or electronic
▶ Notes, Slides, Dictionary
▶ Your own previous Exams
(on Gradescope okay)
▶ Textbook(s) (online ok)
▶ Editor, Compiler, Vole, SSH
▶ Your code / Instructor code
▶ Locally stored webpages
▶ Online Manual Pages
http://man.he.net/ ex: search for ascii
Cannot Use
▶ General Internet

▶ Online calculators, converters, tables
▶ Chat, Texting, IM, etc.
▶ Communication with anyone
but Instructor/ you aren’t sure of something, ask
5

Unsigned Integers: Decimal and Binary
▶ Unsigned integers are always positive: unsigned int i = 12345;
▶ To understand binary, recall how decimal numbers “work”
Decimal: Base 10 Example
Each digit adds on a power 10 80,345 =5 × 100+
Binary: Base 2 Example
Each digit adds on a power 2
4 × 101+ 3 × 102+ 0 × 103+ 8 × 104
5+40+300+80,000
5 ones 40 tens
300 hundreds 0 thousands
80, 000 …
110012 =1 × 20+ 0 × 21+
0 × 22+ 1 × 23+ 1 × 24+
=1+8+16 So, 110012 = 2510
1 ones 0 twos 0 fours 8 eights
16 sixteens = 25
6

Exercise: Convert Binary to Decimal
Base 2 Example:
11001 =1 × 20+ 0 × 21+
0 × 22+ 1 × 23+ 1 × 24+
=1 + 8 + 16 So, 110012 = 2510
1 0 0 8
16 = 25
Try With a Pal
Convert the following two numbers from base 2 (binary) to base 10 (decimal)
▶ 111
▶ 11010
▶ 01100001
7

Answers: Convert Binary to Decimal
1112 =1×22 +1×21 +1×20 =1 × 4 + 1 × 2 + 1 × 1 =710
110102 =1×24 +1×23 +0×22 +1×21 +0×20 =1 × 16 + 1 × 8 + 0 × 4 + 1 × 2 + 0 × 1 =2610
011000012 =0×27 +1×26 +1×25 +0×24 +0×23 +0×22 +0×21 +1×20
=0 × 128 + ×64 + 1 × 32 + 0 × 16 +0×8+0×4+0×2+1×1
=9710
Note: last example ignores leading 0’s
8

The Other Direction: Base 10 to Base 2
Converting a number from base 10 to base 2 is easily done using repeated division by 2; keep track of remainders
Convert 124 to base 2:
124 ÷ 2 = 62 62 ÷ 2 = 31 31 ÷ 2 = 15 15 ÷ 2 = 7
7÷2=3 3÷2=1 1÷2=0
rem 0 rem 0 rem 1 rem 1 rem 1 rem 1 rem 1
▶ Last step got 0 so we’re done.
▶ Binary digits are in remainders in reverse
▶ Answer: 1111100
▶ Check:
0+0+22 +23 +24 +25 +26 = 4+8+16+32+64 = 124
9

Decimal, Hexadecimal, Octal, Binary
▶ Numbers exist independent of any writing system
▶ Can write the same number in a variety of bases
▶ C provides syntax for most common bases used in computing
Decimal Binary
Hexadecimal Octal 16 8 7D16 1758 0x.. 0… 0x7D 0175
Base 10 Mathematical 125 C Prefix None C Example 125
2
11111012 0b… 0b1111101
▶ Hexadecimal often used to express long-ish byte sequences Larger than base 10 so for 10-15 uses letters A-F
▶ Examine number_writing.c and table.c for patterns
▶ Expectation: Gain familiarity with doing conversions between bases as it will be useful in practice
10

Hexadecimal: Base 16
▶ Hex: compact way to write bit sequences
▶ One byte is 8 bits
▶ Each hex character
represents 4 bits
▶ Each Byte is 2 hex digits
|———–+—————-+—–| | Byte | Hex | Dec | |———–+—————-+—–| |01010111|57=5*16+7 | 87| |57| || |00111100|3C=3*16+12| 60| |3 C=12| | | | 1110 0010 | E2 = 14*16 + 2 | 226 | |E=142 | | | |———–+—————-+—–|
Hex to
4 bit equivalence
Dec Bits Hex 0 0000 0
1 0001 1
2 0010 2
3 0011 3 4 0100 4 5 0101 5 6 0110 6 7 0111 7 8 1000 8 9 1001 9
10 1010 A 11 1011 B 12 1100 C 13 1101 D 14 1110 E 15 1111 F
11

Exercise: Conversion Tricks for Hex and Octal
Examples shown in this week’s HW, What tricks are illustrated?
|———+————–+—————–+———————–|
| Decimal | Byte = 8bits | Byte by 4 | Hexadecimal | |———+————–+—————–+———————–|
| 87| 01010111|bin:01010111 |57=5*16+7 |
| | |hex:5 7 |hexdec | |||||
| 60| 00111100|bin:00111100 |3C=3*16+12 |
| | |hex:3 C=12|hexdec | |||||
| 226| 11100010|bin:11100010 |E2=14*16+2 |
| | | hex: E=14 2 | hex dec | |———+————–+—————–+———————–|
| Decimal | Byte = 8bits | Byte by 3 | Octal | |———+————–+—————–+———————–|
| 87| 01010111|bin:01010111|127=1*8^2+2*8+7|
| | |oct:1 2 7 |oct dec | |||||
| 60| 00111100|bin:00111100|074=0*8^2+7*8+4|
| | |oct:0 7 4 |oct dec | |||||
| 226| 11100010|bin:11100010|342=3*8^2+4*8+2|
| | |oct:3 4 2 |oct dec | |———+————–+—————–+———————–| 12

Answers: Conversion Tricks for Hex and Octal
▶ Converting between Binary and Hexadecimal is easiest when grouping bits by 4: each 4 bits corresponds to one hexadecimal digit
bin: 0101 0111 bin: 1110 0010
hex: 5 7 hex: E=14 2
▶ Converting between Binary and Octal is easiest when grouping bits by 3: each 3 bits corresponds to one octal digit
bin: 01 010 111 bin: 11 100 010 oct:1 2 7 oct:3 4 2
13

Unix Permissions with Octal
▶ Octal arises associated with Unix file permissions ▶ Every file has 3 permissions for 3 entities
▶ Permissions are true/false so a single bit will suffice
▶ ls -l: long list files, shows permissions
▶ chmod 665 somefile.txt:
change permissions of
binary octal
110110101 = 665
rw-rw-r-x somefile.txt
somefile.txt to those UGO
shown to the right
▶ chmod 777 x.txt: read /
SRT EOH RUE
write / exec for everyone PR
▶ chmod also honors letter versions like r and w
▶ chmod u+x script.sh # make file executable
Readable chmod version:
chmod u=rw,g=rw,o=rx somefile.txt
14

Character Coding Conventions
▶ Would be hard for people to share words if they interpretted bits as letters differently
▶ ASCII: American Standard Code for Information Interchange An old standard for bit/character correspondence
▶ 7 bits per character, includs upper, lower case, punctuation
Dec Hex
65 41
66 42
67 43
68 44
69 45
70 46
71 47
72 48
73 49
74 4A
75 4B
76 4C
77 4D
91 5B
92 5C
Binary Char Dec Hex
Binary Char 01001110 N 01001111 O 01010000 P 01010001 Q 01010010 R 01010011 S 01010100 T 01010101 U 01010110 V 01010111 W 01011000 X 01011001 Y 01011010 Z 01100001 a 01100010 b
01000001 A 01000010 B 01000011 C 01000100 D 01000101 E 01000110 F 01000111 G 01001000 H 01001001 I 01001010 J 01001011 K 01001100 L 01001101 M 01011101 [ 01011110 \
78 4E 79 4F 80 50 81 51 82 52 83 53 84 54 85 55 86 56 87 57 88 58 89 59 90 5A 97 61 98 62
15

Unicode
▶ World: why can’t I write
in my code/web address/email?
▶ America: ASCII has 128 chars. Deal with it.
▶ World: Seriously?
▶ America: We invented
computers. ’Merica!
▶ World:
▶ ASCII Uses 7 bits per char, limited to 128 characters
▶ UTF-8 uses 1-4 bytes per character to represent many more characters
(1,112,064 codepoints)
▶ Uses 8th bit in a byte to indicate extension to more than a single byte
▶ Requires software to understand coding convention allowing broader language support
▶ ASCII is a proper subset of UTF-8 making UTF-8 backwards compatible and increasingly popular
▶ America: … Unicode?
▶ World: But my language takes
more bytes than American.
▶ America: Deal with it. ’Merica!
16

Binary Integer Addition/Subtraction
Adding/subtracting in binary works the same as with decimal EXCEPT that carries occur on values of 2 rather than 10
ADDITION #1
1 11 <-carries 0100 1010 = 74 +01011001=89 ------------ 1010 0011 = 163 ADDITION #2 1111 1 <-carries 0110 1101 = 109 +01111001=121 ------------ 1110 0110 = 230 SUBTRACTION #1 ? <-carries 0111 1001 = 121 -00010011= 19 ------------ VVVVVVVVVVVVV VVVVVVVVVVVVV VVVVVVVVVVVVV x12 <-carries 0111 0001 = 119 -00010011= 19 ------------ 0110 0110 = 102 17 Two’s Complement Integers: Representing Negative Values ▶ To represent negative integers, must choose a coding system ▶ Two’s complement is the most common for this ▶ Alternatives exist ▶ Signed magnitude: leading bit indicates pos (0) or neg (1) ▶ One’s complement: invert bits to go between positive negative ▶ Great advantage of two’s complement: signed and unsigned arithmetic are identical ▶ Hardware folks only need to make one set of units for both unsigned and signed arithmetic 18 Summary of Two’s Complement Short explanation: most significant bit is associated with a negative power of two. UNSIGNED BINARY --------------- 7654 3210 : position ABCD EFGH : 8 bits A: 0/1 * +(2^7) *POS* B: 0/1 * +(2^6) C: 0/1 * +(2^5) ... H: 0/1 * +(2^0) UNSIGNED BINARY --------- 7654 3210 : position 1000 0000 = +128 1000 0001 = +129 1000 0011 = +131 1111 1111 = +255 0000 0000 = 0 0000 0001 = +1 0000 0101 = +5 0111 1111 = +127 TWO's COMPLEMENT (signed) ------------------------- 7654 3210 : position ABCD EFGH : 8-bits A: 0/1 * -(2^7) *NEG* B: 0/1 * +(2^6) C: 0/1 * +(2^5) ... H: 0/1 * +(2^0) TWO's COMPLEMENT (signed) --------- 7654 3210 : position 1000 0000 = -128 1000 0001 = -127 = -128+1 1000 0011 = -125 = -128+1+2 1111 1111 =-1 0000 0000 = 0 0000 0001 = +1 0000 0101 = +5 0111 1111 = +127 = -128+1+2+4+..+64 [ +127 ] 19 Two’s Complement Notes ▶ Leading 1 indicates negative, 0 indicates positive ▶ All 0’s = Zero ▶ Positive numbers are identical to unsigned Conversion Trick Positive → Negative ▶ Invert bits, Add 1 Negative → Positive ▶ Invert bits, Add 1 Same trick works both ways, implemented in hardware for the unary minus operator as in int y = -x; ~ 0110 1000 +104 : negate ----------- 1001 0111 inverted +1 ----------- 1001 1000 = -104 ~ 1001 1000 = -104 : negate ----------- 0110 0111 = +103 inverted +1 ----------- 0110 1000 = +104 Add Pos/Neg should give 0 1 1111 <-carries 0110 1000 = +104 + 1001 1000 = -104 ----------- x 0000 0000 = zero 20 Overflow ▶ Sums that exceed the representation of the bits associated with the integral type overflow ▶ Excess significant bits are dropped ▶ Addition can result in a sum smaller than the summands, even for two positive numbers (!?) ▶ Integer arithmetic in fixed bits is a mathematical ring Examples of Overflow in 8 bits ADDITION #3 OVERFLOW 1 1111 111 <-carries 1111 1111 = 255 +00000001= 1 ------------ 1 0000 0000 = 256 x drop 9th bit ------------ 0000 0000 = 0 ADDITION #4 OVERFLOW 1 1 <-carries 1010 1001 = 169 +11000001=193 ------------ 1 0110 1010 = 362 x drop 9th bit ------------ 0110 1010 = 106 21 Underflow ▶ Underflow occurs in unsigned arithmetic when values go below 0 (no longer positive) ▶ Pretend that there is an extra significant bit to carry out subtraction ▶ Subtracting a positive integer from a positive integer may result in a larger positive integer (?!?) ▶ Integer arithmetic in fixed bits is a mathematical ring Examples of 8-bit Underflow SUBTRACTIION #2 UNDERFLOW ?<-carries 0000 0000 = 0 - 0000 0001 =1 ------------ VVVVVVVVVVVVV ?<-carries 1 0000 0000 = 256 (pretend) - 0000 0001 = 1 ------------ VVVVVVVVVVVVV x 2<-carries 0 1111 1110 = 256 - 0000 0001 = 1 ------------ 1111 1111 = 255 22 Overflow and Underflow In C Programs ▶ See over_under_flow.c for demonstrations in a C program. ▶ No runtime errors for under/overflow ▶ Good for hashing and cryptography ▶ Bad for most other applications: system critical operations should use checks for over-/under-flow ▶ See textbook Ariane Rocket Crash which was due to overflow of an integer converted from a floating point value ▶ At the assembly level, there are condition codes indicating that overflow has occurred 23 Endinaness: Byte ordering in Memory ▶ Single bytes like ASCII characters lay out sequentially in memory in increasing address ▶ Multi-byte entities like 4-byte ints require decisions on byte ordering ▶ We think of a 32-bit int like this Binary: 0000 0000 0000 0000 0001 1000 1110 1001 000018E9 Hex : 000018E9 Decimal: 6377 ▶ But need to assign memory addresses to each byte ▶ Little Endian: least significant byte early ▶ Big Endian: most significant byte early ▶ Example: Integer starts at address #1024 Address LittleEnd: #1027 #1026 #1025 Binary: 0000 0000 0000 0000 0001 1000 1110 1001 000018E9 BigEnd: #1024 #1025 #1026 #1027 Address #1024 24 Little Endian vs. Big Endian ▶ Most modern machines use little endian by default ▶ Processor may actually support big endian ▶ Both Big and Little Endian have engineering trade-offs ▶ At one time debated hotly among hardware folks: a la Gulliver’s Travels conflicts ▶ Intel chips were little endian and “won” so set the basis for most modern use ▶ Big endian byte order shows up in network programming: sending bytes over the network is done in big endian ordering ▶ Examine show_endianness.c to see C code to print bytes in order ▶ Since most machines are little endian, will see bytes print in the revers order usually think of them 25 Output of show_endianness.c 1 > cat show_endianness.c
2 // Show endiannes layout of a binary number in memory Most machines
3 // are little endian so bytes will print leas signficant earlier.
4 #include
5
6 int main(){
7 8 9
int bin = 0b00000000000000000001100011101001;
// | | | | | | | |
// 0 0 0 0 1 8 e 9 printf(“%d\n%x\n”,bin,bin);
// 6377
// print bytes of bin from low // to high memory address
10 11 12 13 14 15 16
17 }
18 >
19
20 >
21 6377
22 18e9
23 e91800
// show decimal/hex of binary unsigned char *ptr = (unsigned char *) &bin; // pointer to beginning of bin
for(int i=0; i<4; i++){ printf("%x ", ptr[i]); } printf("\n"); return 0; gcc show_endianness.c ./a.out Notice: num prints with value 18e9 but bytes appear in reverse order e9 18 when looking at memory 26 Integer Ops and Speed ▶ Along with Addition and Subtraction, Multiplication and Division can also be done in binary ▶ Algorithms are the same as base 10 but more painful to do by hand ▶ This pain is reflected in hardware speed of these operations ▶ The Arithmetic and Logic Unit (ALU) does integer ops in the machine ▶ A clock ticks in the machine at some rate like 3Ghz (3 billion times per second) ▶ Under ideal circumstances, typical ALU Op speeds are Operation Cycles Addition 1 Logical 1 Shifts 1 Subtraction 1 Multiplication 3 Division >30
▶ Due to disparity, it is worth knowing about relation between multiply/divide and bitwise operations
▶ Compiler often uses such tricks: shift rather than multiply/divide
27

Mangling bits puts hair on your chest
Below contrasts difference between logical and bitwise operations.
int xl = 12 || 10; // truthy (Logical OR) intxb=12| 10; // 14 (Bitwise OR) int yl = 12 && 10; // truthy (Logical AND)
intyb=12& intzb=12^ int wl = !12; int wb = ~12;
10; //8 (Bitwise AND) 10; //6 (Bitwise XOR) // falsey (Logical NOT)
//3 (Bitwise NOT/INVERT)
▶ Bitwise ops evaluate on a per-bit level ▶ 32 bits for int, 4 bits shown
Bitwise OR Bitwise AND 1100=12 1100=12 |1010=10 &1010=10
Bitwise XOR Bitwise NOT 1100=12
^1010=10 ~1100=12 ———– ———– ———– ———– 1110=14 1000= 8 0110 = 6 0011= 3
28

Bitwise Shifts
▶ Shift operations move bits within a field of bits ▶ Shift operations are
x=y<> k; // right shift y by k bits, store in x ▶ All integral types can use shifts: long, int, short, char
▶ Not applicable to pointers or floating point
▶ Examples in 8 bits
// 76543210
char x = 0b00010111; // 23 chary=x<<2; //leftshiftby2 // y = 0b01011100; // 92 // x = 0b00010111; // not changed charz=x>>3; //rightshiftby3
// z = 0b00000010; // 2
// x = 0b00010111; // not changed char n = 0b10000000; // -128, signed chars=n>>4; //rightshiftby4
// s = 0b11111000; // -8, sign extension
// right shift >> is “arithmetic”
29

Shifty Arithmetic Tricks
▶ Shifts with add/subtract can be used instead of multiplication and division
▶ Turn on optimization: gcc -O3 code.c
▶ Compiler automatically does this if it thinks it will save cycles ▶ Sometimes programmers should do this but better to convince
compiler to do it for you, comment if doing manually
Multiplication
// 76543210
char x = 0b00001010; // 10 char x2 = x << 1; // 10*2 // x2 = 0b00010100; // 20 char x4 = x << 2; // 10*4 // x4 = 0b00101000; // 40 char x7 = (x << 3)-x; // 10*7 // x7=(x*8)-x; //10*7 // x7 = 0b01000110; // 70 // 76543210 Division // 76543210 char y = 0b01101110; // 110 char y2 =y>>1; // 110/2 // y2 = 0b00110111; // 55 char y4 =y>>2; // 110/4 // y4 = 0b00011011; // 27 char z = 0b10101100; // -84 char z2 =z>>2; // -84/4 // z2 = 0b11101011; // -21 // right shift sign extension
30

Exercise: Checking / Setting Bits
Use a combination of bit shift / bitwise logic operations to… 1. Check if bitiofintxis set (has value 1)
2. Clear bit i (set bit at index i to value 0)
Show C code for this
31

Answers: Checking / Setting Bits
1. Check if bitiofintxis set (has value 1)
int x = …;
int mask = 1; // or 0b0001 or 0x01 …
int shifted = mask << i; // shifted 0b00...010..00 if(x & shifted){ // x & 0b10...010..01 ... // ------------------ }// 0b00...010..00 2. Clear bit i (set bit at index i to value 0) int x = ...; int mask = 1; // or 0b0001 or 0x01 ... int shifted = mask << i; // shifted 0b00...010..00 int inverted = ~shifted; // inverted 0b11...101..11 x = x & inverted; // x & 0b10...010..01 ... // ------------------ // 0b10...000..01 32 Showing Bits ▶ printf() capabilities: %d as Decimal %x as Hexadecimal %o as Octal %c as Character ▶ No specifier for binary ▶ Can construct such with bitwise operations ▶ Code pack contains two codes to do this ▶ printbits.c: single args printed as 32 bits ▶ showbits.c: multiple args printed in binary, hex, decimal ▶ Showing bits usually involves shifting and bitwise AND & ▶ Example from showbits.c #define INT_BITS 32 // print bits for x to screen void showbits(int x){ for(int i=INT_BITS-1; i>=0; i–){
int mask = 1 << i; if(mask & x){ printf("1"); } else { printf("0"); } } } 33 Bit Masking ▶ Semi-common for functions to accept bit patterns which indicate true/false options ▶ Frequently makes use of bit masks which are constants associated with specific bits ▶ Example from earlier: Unix permissions might be... #define S_IRUSR 0b100000000 // User #define S_IWUSR 0b010000000 // User #define S_IXUSR 0b001000000 // User #define S_IRGRP 0b000100000 // Group Read ... #define S_IWOTH 0b000000010 // Others Write #define S_IXOTH 0b000000001 // Others Execute ▶ Use them to create options to C functions like int permissions = S_IRUSR|S_IWUSR|S_RGRP; chmod("/home/kauffman/solution.zip",permissions); Read Write Execute 34