OSU CSE 2421
• Required Reading: Computer Systems: A Programmer’s Perspective, 3rd Edition
• Chapter 2, Section 2.1.3
• Chapter 3, Sections 3.9 – 3.10.3 (inclusive)
Pointers on C
• Chapter 15 through section 15.7 (inclusive)
J.E.Jones
•For humans, it is intuitive to write the most significant byte of a multi-byte number on the left, and then write the remaining bytes to the right.
•For example, suppose we represent 550 (base 10) as a 32-bit binary value:
0000 0000 0000 0000 0000 0010 0010 0110 (spaces provided for clarity)
•In hex, this value can be written as: 00000226
•In some machines (e.g., SPARC, Power, PowerPC, MIPS), the bytes are stored in memory in the order to which we are accustomed, that is, with the most significant byte at the first address in memory, and with the remaining bytes stored in order till the least significant byte, which is stored at the highest address.
•Such machines are called “Big-Endian”, because the most significant byte is stored first in memory.
•For example, if the 32-bit integer on the previous slide is stored in RAM starting at address 1000 hex, the bytes, and the addresses at which they are stored, are as shown on the next slide.
Address Byte
1000 00
1001 00
1002 02
1003 26
Note that the bits within each byte keep their original order.
•In other machines (Intel, including x86, and ARM processors, e.g.), however, the bytes are stored in the opposite order in memory; that is, the least significant byte is stored at the first address in memory, and the next most significant byte at the next address, and so on, in order, till the most significant byte, which is stored at the highest address.
•Such machines are called “Little-Endian”, because the least significant byte is stored first in memory.
•If the 32-bit integer considered above, hex 00000226, is stored in RAM starting at address 1000, the bytes, and the addresses at which they are stored in a Little-Endian machine, are as shown on the next slide.
Address Byte
1000 26
1001 02
1002 00
1003 00
Note that the bits within each byte keep their original order.
Address   Little-Endian byte   Big-Endian byte
1000      26                   00
1001      02                   00
1002      00                   02
1003      00                   26
Note that the bits within each byte keep their original order.
•Both Big-Endian and Little-Endian architectures work, but when we do assembly language programming or, in C, if we are looking at specific bytes of larger data types, we need to be aware of which type of architecture the machine we are writing code for uses.
•Also, be careful not to get confused about what is changing here. Endian-ness relates to the order in which the bytes are stored in hardware, and not the order in which the bits are stored. Bit order for a given byte (most significant, least significant, or some byte in between) is the same for both types of machines.
•Many processors, especially RISC processors, are now configurable in terms of Endian-ness.
•Because both types of architecture are used in the real world, you must be familiar with the difference.
The mental note I use is that “Little” Endian stores the “littlest” (i.e. least significant) byte in the “littlest” (i.e. lowest) memory address.
This isn’t the only way to keep the two straight in your mind, but it works for me.
If you come up with a different mental note that works for you, let me know and I’ll add it to this slide. It may help someone whom my note above doesn’t help.
Consider
struct prob {
    int i;
    int j;
    int a[2];
    int *p;
};
Four fields:
Two 4-byte values (int)
A two-element array of type int (8 bytes)
An 8-byte integer pointer
for a total of 24 bytes
Unions provide a way to circumvent the type system in C, allowing a single object to be referenced according to multiple types.
You can also think of unions as a way to interpret the same binary pattern in different ways.
Syntax is similar to struct.
However, all fields reference the same block of memory rather than different blocks.
union {
    int b;
    char y[4];
};
The size of this union is 4 bytes.
struct {
    int b;
    char y[4];
};
The size of this structure is 8 bytes.
struct { int b; char y[4]; } s_x;
s_x.b = 0x02030405; s_x.y = {5,4,3,2};   /* shorthand for y[0]=5, y[1]=4, y[2]=3, y[3]=2 */
Address    Contents    Field
0x600020   0000 0101   int b
0x600021   0000 0100
0x600022   0000 0011
0x600023   0000 0010
0x600024   0000 0101   char y[4]
0x600025   0000 0100
0x600026   0000 0011
0x600027   0000 0010

union { int b; char y[4]; } u_x;
u_x.b = 0x02030405;   /* y values are set too, because y is in the same memory as u_x.b */
Address    Contents    Field
0x600020   0000 0101   int b or char y[4]
0x600021   0000 0100
0x600022   0000 0011
0x600023   0000 0010
Remember Little Endian vs Big Endian? stdlinux is Little Endian, and that is what is represented here.
struct { int b; char y[4]; } s_x;
s_x.y[1] = 0x06; s_x.y[3] = 0x07;
Address    Contents    Field
0x600020   0000 0101   int b
0x600021   0000 0100
0x600022   0000 0011
0x600023   0000 0010
0x600024   0000 0101   char y[4]
0x600025   0000 0110
0x600026   0000 0011
0x600027   0000 0111

union { int b; char y[4]; } u_x;
u_x.y[1] = 0x06; u_x.y[3] = 0x07;
Address    Contents    Field
0x600020   0000 0101   int b or char y[4]
0x600021   0000 0110
0x600022   0000 0011
0x600023   0000 0111
What is the value of s_x.b? What is the value of u_x.b?
After s_x.y[1] = 0x06; s_x.y[3] = 0x07; and u_x.y[1] = 0x06; u_x.y[3] = 0x07; (memory contents as on the previous slide):
What is the value of s_x.b? 0x02030405
What is the value of u_x.b? 0x07030605
The struct value stayed the same; the union value changed!
#include <stdio.h>
int main(void)
{
    union {
        int i;
        char y[4];
    } u_x;
    u_x.i = 0x02030405;
    printf("Integer i is 0x%08x\n", u_x.i);
    printf("y[0] is 0x%02x\n", u_x.y[0]);
    printf("y[1] is 0x%02x\n", u_x.y[1]);
    printf("y[2] is 0x%02x\n", u_x.y[2]);
    printf("y[3] is 0x%02x\n", u_x.y[3]);
    u_x.y[1] = 0x06;   /* overwrites the second-lowest byte of i */
    u_x.y[3] = 0x07;   /* overwrites the highest byte of i */
    printf("Integer i is 0x%08x\n", u_x.i);
    return(0);
}
Program Output:
[jones.5684@fl1 test]$ test_union
Integer i is 0x02030405
y[0] is 0x05
y[1] is 0x04
y[2] is 0x03
y[3] is 0x02
Integer i is 0x07030605
[jones.5684@fl1 test]$
Consider the following:
struct S3 {
    char c;
    int i[2];
    double v;
} S3_val;
union U3 {
    char c;
    int i[2];
    double v;
} U3_val;
union P4 {
    long *l_ptr;
    short *s_ptr;
    float *flt_ptr;
    int *int_ptr;
} P4_val;
union F7 {
    float fl_val;
    unsigned char c_val[4];
    unsigned int i_val[2];
} F7_val;
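How can you print $$ values with a comma every 3 digits?
#include <stdio.h>
#include <locale.h>
int main(void)
{
    double amount = 1234567.89;   /* double: a float holds only about 7 significant digits */
    setlocale(LC_NUMERIC, "");    /* use the locale from the environment */
    printf("$%'.2f\n", amount);
    return 0;
}
Prints: $1,234,567.89 (in a locale, such as en_US, that groups digits with commas)
The ‘$’ prints simply because it is an ordinary character in front of the % conversion.
Note the single quote after the %: it is required, and it is a POSIX extension to printf() rather than part of standard C.
Ref: http://stackoverflow.com/questions/1449805/how-to-format-a-number-from-1123456789-to-1-123-456-789-in-c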
Many computer systems place restrictions on the allowable addresses for the primitive data types, requiring that the address for some objects be a multiple of some value K (typically 2, 4, or 8, but 16 for some newer processors).
Suppose a processor always fetches 8 bytes from memory with an address that must be a multiple of 8.
If we can guarantee that any double will be aligned to have its address be a multiple of 8, then the value will always be read or written with a single memory operation.
For now, just knowing that Data Alignment “happens” is good enough. We’ll get more specific in the 2nd half of the semester.
As discussed previously, stdin, stdout and stderr are three data streams that are defined by default for any executable that requires I/O.
What if I want to read or write from/to a file on disk instead of from the keyboard/screen without using redirection?
◦ I’ll need a “file pointer” or 2 or 3…
◦ stdio.h defines a structure called FILE, which is a data structure used to access a data stream. A pointer to it (FILE *) serves as our handle on the stream; it is not the same thing as the integer file descriptors used by Unix system calls.
◦ FILE is *not* the data file that is on the disk.
Open the file:
FILE *input_file;
input_file = fopen("lab2_datafile", "r");
/* alternatively, use a full pathname: input_file = fopen("/home/lab2/lab2_datafile", "r"); */
if (input_file == NULL) {
    perror("lab2_datafile");
    exit(EXIT_FAILURE);
}
The first argument is any ASCII, NUL-terminated string that denotes a filename (it can be a variable).
The second argument, "r", opens the file for reading.
perror() is only used for operating system errors: it prints the message that goes with the current value of errno. Since input_file == NULL means we couldn’t open the file, that is an OS error.
Now we have a shiny new piece of gold!! And we are keeping it in the variable input_file.
From before:
FILE *input_file;
Read from the file:
float float_val;
fscanf(input_file, "%f", &float_val);
fscanf() works the same way scanf() works except that we must specify the file to read from as the first parameter. All other parameters just shift one to the right.
Open the file:
FILE *output_file;
float float_val = 6.25;
output_file = fopen("lab2_output", "w");
if (output_file == NULL) {
    perror("lab2_output");
    exit(EXIT_FAILURE);
}
fprintf(output_file, "The value of float_val is %f\n", float_val);
"r" means we want to read from the file
"w" means we want to write to the file
"a" means we want to append to a file
The first argument is any ASCII, NUL-terminated string that denotes a filename. It can be a variable. Since we are opening the file to write to it, it will be created if it doesn’t exist; if it does exist, "w" discards its old contents.
fprintf() works the same way printf() works except that we must specify the file to write to as the first parameter. All other parameters just shift one to the right.
perror() is only used for operating system errors. Since output_file == NULL means we couldn’t open or create the file, that is an OS error.
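if (fclose(input_file) != 0) {
    perror("fclose");
    exit(EXIT_FAILURE);
}
if (fclose(output_file) != 0) {
    perror("fclose");
    exit(EXIT_FAILURE);
}
perror() is only used for operating system errors. There must be a seriously sick OS if we can’t close a file. For an output file, fclose() also writes out any data still sitting in the buffer, so a failure here can mean that data never reached the disk.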
The definition of feof():
C Language: feof() function (Test for End-of-File). In the C Programming Language, the feof() function tests to see if the end-of-file indicator has been set for a stream pointed to by stream.
Many students think that using feof() is the way to find when the end of data in a file has been encountered. IT’S NOT! Note the definition of feof() above. feof() tests to see if the EOF indicator has ALREADY been set!!! This means that some other function that tried to read the file has already encountered EOF and set the end-of-file indicator. Think of feof() as figuratively walking up to the Unix/Linux representative and asking “Hey! Has EOF been found on my data stream already?” Then the Unix/Linux representative looks up at his/her overall system status bulletin board, notes that EOF has been set to 1, then tells feof() “Yup!”
So some other read function ALREADY FOUND IT! What we are looking for is how to determine when EOF occurs THE FIRST TIME, when the status of the stream gets set to EOF.
So don’t use feof() to control a read loop; it will cause your program to have bugs. It’s wrong because (in the absence of a read error) it introduces logic that allows the loop to execute one more time than the author expects. If there is a read error, the end-of-file indicator is never set, so the loop never terminates. I know this because I’ve debugged enough student programs to know what symptoms it causes. If you want further info, go here:
https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong