Assorted Topics in C
OSU CSE 2421
J. E. Jones
• Required Reading: Computer Systems: A Programmer’s Perspective, 3rd Edition
    • Chapter 2, Section 2.1.3
    • Chapter 3, Sections 3.9 through 3.10.3 (inclusive)
• Pointers on C
    • Chapter 15 through Section 15.7 (inclusive)
•For humans, it is intuitive to write the most significant byte of
a multi-byte number on the left, and then write the remaining
bytes to the right.
•For example, suppose we represent 550 (base 10) as a 32-bit binary
value:
0000 0000 0000 0000 0000 0010 0010 0110
(spaces provided for clarity)
•In hex, this value can be written as:
00000226
•In some machines (e.g., SPARC, Power, PowerPC, MIPS), the bytes
are stored in memory in the order to which we are accustomed, that is,
with the most significant byte at the first address in memory, and with
the remaining bytes stored in order till the least significant byte, which
is stored at the highest address.
•Such machines are called “Big-Endian”, because the most significant
byte is stored first in memory.
•For example, if the 32-bit integer on the previous slide is stored in
RAM starting at address 1000 hex, the bytes, and the addresses at
which they are stored, are as shown on the next slide.
Address Byte
1000 00
1001 00
1002 02
1003 26
Note that the bits within each byte have not changed their order
•In other machines (e.g., Intel x86 and most ARM processors),
however, the bytes are stored in the opposite order in memory; that is,
the least significant byte is stored at the first address in memory, and
each more significant byte at the next address, and so on, in order,
till the most significant byte, which is stored at the highest address.
•Such machines are called “Little-Endian”, because the least
significant byte is stored first in memory.
•If the 32-bit integer considered above, hex 00000226, is stored in
RAM starting at address 1000, the bytes, and the addresses at which
they are stored in a Little-Endian machine, are as shown on the next
slide.
Address Byte
1000 26
1001 02
1002 00
1003 00
Note that the bits within each byte have not changed their order
Little-Endian        Big-Endian
Address  Byte        Address  Byte
1000     26          1000     00
1001     02          1001     00
1002     00          1002     02
1003     00          1003     26
Note that the bits within each byte have not changed their order
•Both Big-Endian and Little-Endian architectures work, but when we do
assembly language programming or, in C, look at specific bytes of
larger data types, we need to know which byte order the target
machine uses.
•Also, be careful not to get confused about what is changing here. Endian-ness
relates to the order in which the bytes are stored in hardware, and not the order
in which the bits are stored. Bit order for a given byte (most significant, least
significant, or some byte in between) is the same for both types of machines.
•Many processors, especially RISC processors, are now configurable in terms
of Endian-ness.
•Because both types of architecture are used in the real world, you must be
familiar with the difference.
The mental note I use is that “Little” Endian stores the
“littlest” (i.e. least significant) byte in the “littlest” (i.e.
lowest) memory address.
This isn’t the only way to keep the two straight in your
mind, but it works for me.
If you come up with a different mental note that works for
you, let me know and I’ll add it to this slide. It may help
someone whom my note above doesn’t help.
“Little for little byte first, big for biggest byte first” (Bradley Dufek,
CSE SP’21)
Consider:
struct prob {
    int k;       /* 0x600040 -> 0x600043 */
    int j;       /* 0x600044 -> 0x600047 */
    int a[2];    /* 0x600048 -> 0x60004f */
    int *p;      /* 0x600050 -> 0x600057 */
};
Four fields:
◦ Two 4-byte values (int)
◦ A two-element array of type int (8 bytes)
◦ An 8-byte pointer to int
for a total of 24 bytes
Unions provide a way to circumvent the type system in C, allowing
a single object to be referenced according to multiple types.
You can also think of unions as a way to interpret the same binary
pattern in different ways.
The syntax is similar to that of a struct.
However, all fields reference the same block of memory rather
than different blocks.
union {                       struct {
    int b;                        int b;
    char y[4];                    char y[4];
};                            };
The size of this union        The size of this structure
is 4 bytes.                   is 8 bytes.
struct {                 s_x.b = 0x02030405;
    int b;               s_x.y[0] = 5; s_x.y[1] = 4;
    char y[4];           s_x.y[2] = 3; s_x.y[3] = 2;
} s_x;
union {                  u_x.b = 0x02030405;
    int b;               /* y values are set too, because y is in the same memory as u_x.b */
    char y[4];
} u_x;
Remember Little Endian vs Big Endian? stdlinux is Little Endian, and that is what is
represented here.
struct s_x:
Address      0x600020   0x600021   0x600022   0x600023
int b        0000 0101  0000 0100  0000 0011  0000 0010
Address      0x600024   0x600025   0x600026   0x600027
char y[4]    0000 0101  0000 0100  0000 0011  0000 0010
union u_x:
Address      0x600020   0x600021   0x600022   0x600023
int b or     0000 0101  0000 0100  0000 0011  0000 0010
char y[4]
struct {                 s_x.y[1] = 0x06;
    int b;               s_x.y[3] = 0x07;
    char y[4];
} s_x;
union {                  u_x.y[1] = 0x06;
    int b;               u_x.y[3] = 0x07;
    char y[4];
} u_x;
What is the value of s_x.b?
What is the value of u_x.b?
struct s_x:
Address      0x600020   0x600021   0x600022   0x600023
int b        0000 0101  0000 0100  0000 0011  0000 0010
Address      0x600024   0x600025   0x600026   0x600027
char y[4]    0000 0101  0000 0110  0000 0011  0000 0111
union u_x:
Address      0x600020   0x600021   0x600022   0x600023
int b or     0000 0101  0000 0110  0000 0011  0000 0111
char y[4]
struct {                 s_x.y[1] = 0x06;
    int b;               s_x.y[3] = 0x07;
    char y[4];
} s_x;
union {                  u_x.y[1] = 0x06;
    int b;               u_x.y[3] = 0x07;
    char y[4];
} u_x;
What is the value of s_x.b? 0x02030405
What is the value of u_x.b? 0x07030605
struct s_x:
Address      0x600020   0x600021   0x600022   0x600023
int b        0000 0101  0000 0100  0000 0011  0000 0010
Address      0x600024   0x600025   0x600026   0x600027
char y[4]    0000 0101  0000 0110  0000 0011  0000 0111
union u_x:
Address      0x600020   0x600021   0x600022   0x600023
int b or     0000 0101  0000 0110  0000 0011  0000 0111
char y[4]
The struct value stayed the same.
The union value changed!
#include <stdio.h>
int main(void)
{
    union {
        int i;
        char y[4];
    } u_x;
    u_x.i = 0x02030405;
    printf("Integer i is 0x%08x\n", u_x.i);
    /* y overlays i, so these are the bytes of i in memory order */
    printf("y[0] is 0x%02x\n", u_x.y[0]);
    printf("y[1] is 0x%02x\n", u_x.y[1]);
    printf("y[2] is 0x%02x\n", u_x.y[2]);
    printf("y[3] is 0x%02x\n", u_x.y[3]);
    /* changing two bytes through y changes the value seen through i */
    u_x.y[1] = 0x06;
    u_x.y[3] = 0x07;
    printf("Integer i is 0x%08x\n", u_x.i);
    return 0;
}
Program Output (on a Little-Endian machine):
[jones.5684@fl1 test]$ test_union
Integer i is 0x02030405
y[0] is 0x05
y[1] is 0x04
y[2] is 0x03
y[3] is 0x02
Integer i is 0x07030605
[jones.5684@fl1 test]$
Consider the following:
struct S3 {                   union U3 {
    char c;                       char c;
    int i[2];                     int i[2];
    double v;                     double v;
} S3_val;                     } U3_val;
union P4 {                    union F7 {
    long *l_ptr;                  signed int s_int;
    short *s_ptr;                 unsigned int u_int;
    float *flt_ptr;           } F7_val;
    int *int_ptr;
    void *v_ptr;
} P4_val;
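How can you print $$ values with a comma every 3
digits?
#include <stdio.h>
#include <locale.h>
int main(void)
{
    double amount = 1234567.89;   /* double, because a float keeps only about 7 significant digits */
    setlocale(LC_NUMERIC, "");    /* use the digit grouping of the user's locale */
    printf("$%'.2f\n", amount);
    return 0;
}
Prints: $1,234,567.89 (in a locale that groups digits with commas, such as en_US)
The ‘$’ prints simply because it is an ordinary character that precedes the % in
the format string. Note that the single quote after the % is required; it is a
POSIX extension to printf(), not part of standard C.
Ref: http://stackoverflow.com/questions/1449805/how-to-format-a-number-from-1123456789-to-1-123-456-789-in-c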
Many computer systems place restrictions on the allowable
addresses for the primitive data types, requiring that the
address of some objects be a multiple of some value K
(typically 2, 4, or 8, but 16 for some newer processors).
Suppose a processor always fetches 8 bytes from memory
with an address that must be a multiple of 8.
If we can guarantee that any double will be aligned to have
its address be a multiple of 8, then the value will always be
read or written with a single memory operation.
For now, just knowing that Data Alignment “happens”
is good enough. We’ll get more specific in the 2nd half of
the semester.
As discussed previously, stdin, stdout and stderr are
three data streams that are defined by default for any
executable that requires I/O.
What if I want to read from or write to a file on disk
instead of the keyboard/screen, without using
redirection?
◦ I’ll need a “file pointer” or 2 or 3…
◦ stdio.h defines a type called FILE, which is a data structure
used to access a data stream. A pointer to it (FILE *) is the program’s handle
for the stream. (Strictly speaking, “file descriptor” is the Unix term for the
small integer returned by open(); on Unix/Linux a FILE typically wraps one.)
◦ FILE is *not* the data file that is on the disk.
Open the file:
    FILE *input_file;
    input_file = fopen("lab2_datafile", "r");
    /* alternatively, use a full pathname
    input_file = fopen("/home/lab2/lab2_datafile", "r"); */
    if (input_file == NULL) {
        perror("lab2_datafile");
        printf("no file found\n");
        exit(EXIT_FAILURE);
    }
◦ First argument of fopen(): any ASCII, null-terminated string that
denotes a filename (can be a variable).
◦ Second argument, "r": open the file for reading.
◦ perror() is only used for operating system errors. Since
input_file == NULL means we couldn’t open a file, that is an
OS error.
Now we have a shiny new piece of gold!! And we are keeping
it in the variable input_file.
Open the file, using a filename read from the user:
    FILE *input_file;
    char filename[256];
    scanf(" %255s", filename);   /* the width keeps scanf from overflowing filename */
    input_file = fopen(filename, "r");
From before:
    FILE *input_file;
Read from the file:
    float float_val;
    fscanf(input_file, "%f", &float_val);
fscanf() works the same way scanf() works except that we
must specify the file to read from as the first parameter.
All other parameters just shift one to the right.
Open the file:
    FILE *output_file;
    float float_val = 6.25;
    output_file = fopen("lab2_output", "w");
    if (output_file == NULL) {
        perror("lab2_output");
        exit(EXIT_FAILURE);
    }
    fprintf(output_file, "The value of float_val is %f\n", float_val);
fprintf() works the same way printf() works
except that we must specify the file to write
to as the first parameter. All other parameters
just shift one to the right.
◦ First argument of fopen(): any ASCII, null-terminated string that
denotes a filename. Can be a variable.
◦ Since we want to open the file to write in it, if it
doesn’t exist, one will be created (an existing file is emptied).
◦ perror() is only used for operating system errors. Since
output_file == NULL means we couldn’t open or create the file, that
is an OS error.
Second argument of fopen():
"r" means we want to read from the file
"w" means we want to write to the file
"a" means we want to append to a file
    if (fclose(input_file) != 0) {
        perror("fclose");
        exit(EXIT_FAILURE);
    }
    if (fclose(output_file) != 0) {
        perror("fclose");
        exit(EXIT_FAILURE);
    }
◦ perror() is only used for operating system errors. There
must be a seriously sick OS if we can’t close a file.
The definition of feof():
C Language: feof() function (Test for End-of-File). In the C Programming Language, the feof()
function tests to see if the end-of-file indicator has been set for a stream pointed to by stream.
Many students think that using feof() is the way to find when the end of data in a file has been
encountered. IT’S NOT! Note the definition of feof() above. feof() tests to see if the EOF
indicator has ALREADY been set!!! This means that some other function that tried to read the
file has already encountered EOF and set the end-of-file indicator. Think of feof() as figuratively
walking up to the Unix/Linux representative and asking “Hey! Has EOF been found on my data
stream already?” Then the Unix/Linux representative looks up at his/her overall system status
bulletin board, notes that EOF has been set to 1, then tells feof() “Yup!”
So some other read function ALREADY FOUND IT! What we are looking for is how to determine when
EOF occurs THE FIRST TIME, when the status of the stream gets set to EOF.
So don’t use feof() to control a read loop; it will cause your program to have bugs. It’s wrong
because (in the absence of a read error) it introduces logic that allows the loop to execute one
more time than the author expects. If there is a read error, the loop never terminates. I know this
because I’ve debugged enough student programs to know what symptoms it causes. If you want
further info, go here:
https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong