C Crash Course (II): C Basics for System Programming
Presented by
Dr. Shuaiwen Leon Song
USYD Future System Architecture Lab (FSA) https://shuaiwen-leon-song.github.io/
Warmup Contents
FACULTY OF ENGINEERING
What is the printout of this code?
Segmentation fault (core dumped)!
Core Dump/Segmentation Fault
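A segmentation fault is a specific kind of error caused by accessing memory that “does not belong to you.” When it happens, the operating system usually terminates the program and may write a core dump, a snapshot of the process memory that can be inspected in a debugger.
•It occurs when code reads or writes memory it may not access, for example writing to a read-only location or using a block of memory that has already been freed.
•It usually indicates memory corruption or an invalid pointer.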
Six Main Ways:
1. Modifying a string literal (writing to read-only memory)
2. Accessing memory that has already been freed
3. Accessing an array outside its index bounds
4. Improper use of scanf() (input fetched from STDIN is placed in invalid memory, causing memory corruption)
5. Stack overflow
6. Dereferencing an uninitialized pointer (a pointer must point to valid memory before it is accessed)
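Two of the causes above (out-of-bounds indexing and invalid pointers) can be guarded against explicitly. The following is a minimal sketch; `safe_get` and `safe_get_demo` are hypothetical names chosen for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy arr[idx] into *out only when both pointers are
   valid and idx lies inside the array. Returns 1 on success, 0 otherwise. */
int safe_get(const int *arr, size_t len, size_t idx, int *out)
{
    if (arr == NULL || out == NULL) {   /* cause 6: never dereference an invalid pointer */
        return 0;
    }
    if (idx >= len) {                   /* cause 3: stay inside the array bounds */
        return 0;
    }
    *out = arr[idx];
    return 1;
}

/* Usage: valid reads succeed, invalid ones are rejected instead of crashing. */
void safe_get_demo(void)
{
    int data[3] = {10, 20, 30};
    int value = 0;

    assert(safe_get(data, 3, 1, &value) == 1 && value == 20);
    assert(safe_get(data, 3, 3, &value) == 0);   /* index 3 is out of bounds */
    assert(safe_get(NULL, 3, 0, &value) == 0);   /* no array to read from */
}
```

The caller has to pass the array length itself, because C does not store it with the pointer.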
What is in text?
A text segment, also known as a code segment or simply as text, is one of the sections of a program in an object file or in memory; it contains the executable instructions.
As a memory region, the text segment may be placed below the heap and the stack so that heap or stack overflows cannot overwrite it.
Usually, the text segment is sharable so that only a single copy needs to be in memory for frequently executed programs, such as text editors, the C compiler, the shells, and so on. Also, the text segment is often read-only, to prevent a program from accidentally modifying its instructions.
Examples: string literals defined in functions
const and its pointers defined in functions
Less familiar const
› const prevents the value being modified
– const char *fileheader = "P1";
– fileheader[1] = '3';   Illegal: change of char value
› It can be used to help avoid arbitrary changes to memory
› The value const protects depends on where it appears
– char * const fileheader = "P1";
– fileheader = "P3";   Illegal: change of address value
› You can cast if you know that the memory is writable
char fileheader[] = {'P', '1'};   /* writable */
const char *dataptr = (char*)fileheader;   /* non-writable through dataptr */
char *p = (char*)dataptr;
p[1] = '3';
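The two placements of const can be tried out in a small sketch. The function names below are hypothetical, and the lines that would be rejected by the compiler are left as comments so the block still compiles.

```c
#include <assert.h>

/* Writable storage, viewed through a pointer to const, then modified through
   a non-const pointer obtained with a cast. */
char patch_header(void)
{
    char fileheader[] = {'P', '1', '\0'};  /* array on the stack: writable */
    const char *dataptr = fileheader;      /* read-only view of the same bytes */
    char *p = (char *)dataptr;             /* cast is safe: the storage is writable */

    /* dataptr[1] = '3';   would not compile: the chars are const via dataptr */
    p[1] = '3';
    return fileheader[1];
}

/* char * const: the pointer cannot be re-pointed, but the chars can change. */
char const_pointer_demo(void)
{
    char buffer[] = "P1";
    char *const fixed = buffer;

    /* fixed = buffer + 1;   would not compile: fixed itself is const */
    fixed[1] = '6';
    return buffer[1];
}

void const_demo(void)
{
    assert(patch_header() == '3');
    assert(const_pointer_demo() == '6');
}
```

Casting away const is only safe here because the underlying array was declared without const; doing the same to a string literal is undefined behavior.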
Dangling pointers
A pointer pointing to a memory location that has been deleted (or freed) is called dangling pointer.
#include <stdio.h>
int* foo(int a, int b)
{
    int sum = a + b;
    return &sum;   /* dangling: sum no longer exists once foo returns */
}
int main() {
    int* addr = foo(5, 6);
}
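#include <stdio.h>
int* foo(int a, int b)
{
    static int sum;   /* static storage outlives the call */
    sum = a + b;      /* a static initializer must be a constant, so assign here */
    return &sum;
}
int main() {
    int* addr = foo(5, 6);
}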
You can also do this!
#include <stdio.h>
#include <stdlib.h>
int* foo(int a, int b, int* sum) {
    *sum = a + b;
    return sum;
}
int main() {
    int *sum = (int*) malloc(sizeof(int)*10);
    if (sum == NULL) {
        printf("Memory not allocated.\n");
        exit(1);
    }
    sum = foo(5, 6, sum);
    printf("%d\n", *sum);
    free(sum);
    return 0;
}
Which one causes problems?
Which of the above three functions are likely to cause problems with pointers?
(A) Only P3
(B) Only P1 and P3
(C) Only P1 and P2
(D) P1, P2 and P3
Try not to return the address of a local variable! The result can be a segmentation fault or a program crash.
Pointers
[Figure: memory drawn as a column of byte addresses (0x100, 0x101, …, 0x108, …) next to their 8-bit contents. The contents are random values initially.]
› a pointer is essentially a variable that stores a memory address
› we can find out the address of a variable using the & operator
char initial = 'A';
char *initp = &initial;
&initial is the address of initial.
initp is a pointer to initial.
[Figure: the same column of memory addresses 0x100 to 0x108 and their contents.]
[Figure: somewhere in memory, a cell labeled “ptr”; somewhere else in memory, a cell labeled “count”.]
int count;
int *ptr;
count = 2;
ptr = &count;
printf("%d\n", count);
printf("%d\n", *ptr);
printf("%p\n", (void *)&count);
printf("%p\n", (void *)ptr);
Suppose the address of count is 0x1000 (= 4,096). The output is then:
2
2
0x1000
0x1000
(The exact format that %p uses for an address is implementation-defined.)
Clearly, the value of a pointer can only be determined at run-time.
Pointers (notation)
› Pointer operators:
– address operator, ‘&’
– indirection operator, ‘*’
Note that these operators are “overloaded”, that is, they have more than one meaning.
– ‘&’ is also used in C as the bitwise ‘AND’ operator
– ‘*’ is also used in C as the multiplication operator
Pointers (notation)
› The indirection operator, ‘*’, is used in a variable declaration to declare a “pointer to a variable of the specified type”:
int * countp; /* pointer to an integer */
Variable name: “countp”
Type: “a pointer to an integer”
Pointers (notation)
What do the following mean?
float * amt;
int ** tricky;
Answers:
float * amt; is a pointer (labeled “amt”) to a float.
int ** tricky; is a pointer (labeled “tricky”) to a pointer to an int.
Pointers (notation)
› The indirection operator, ‘*’, is used to “unravel” the indirection:
countp points to an integer variable that contains the value 2. Then…
printf("%d", *countp);
…prints ‘2’ to standard output.
[Figure: “unravel the indirection”: countp points to a cell containing 2.]
Pointers (notation)
What is output in the following?
printf("%d", count);   prints 17
printf("%d", *countp);   prints 17
printf("%p", (void *)countp);   Don’t know… but it will be the address of count. Why?
[Figure: countp points to count, which holds 17.]
Pointers (notation)
› The address operator, ‘&’, is used to access the address of a variable.
› This completes the picture! A pointer can be assigned the address of a variable simply:
int * countp = &count;
This declares “a pointer to an integer” called countp and assigns countp the address of count.
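The address and indirection operators can be seen working together in a short sketch. `store_through` and `pointer_basics_demo` are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Write value into the int that p points to. */
void store_through(int *p, int value)
{
    *p = value;
}

void pointer_basics_demo(void)
{
    int count = 17;
    int *countp = &count;        /* countp holds the address of count */

    assert(countp == &count);
    assert(*countp == 17);       /* indirection reads count */

    *countp = 2;                 /* indirection can also write count */
    assert(count == 2);

    store_through(&count, 5);    /* a function can change the caller's variable */
    assert(count == 5);
}
```

Passing `&count` is how C lets a function modify a variable that belongs to its caller.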
Wild pointers: I don’t like it
Uninitialized pointers are known as wild pointers because they point to some arbitrary memory location and may cause a program to crash or behave badly.
int main()
{
int *p; /* wild pointer */
/* Some unknown memory location is being corrupted. This should never be done. */
*p = 12;
}
int main()
{
    int *p; /* wild pointer */
    int a = 10;
    p = &a; /* p now points to valid memory */
    *p = 12;
}
This is OK!
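One common defensive habit, sketched below under the assumption that NULL is used to mean “points nowhere”, is to initialize every pointer and check it before writing through it. `store_if_valid` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Write value through p only when p is not NULL. Returns 1 if written. */
int store_if_valid(int *p, int value)
{
    if (p == NULL) {
        return 0;
    }
    *p = value;
    return 1;
}

void wild_pointer_demo(void)
{
    int a = 10;
    int *p = NULL;                         /* initialized, so no longer wild */

    assert(store_if_valid(p, 12) == 0);    /* nothing is written */
    assert(a == 10);

    p = &a;                                /* now p points to valid memory */
    assert(store_if_valid(p, 12) == 1);
    assert(a == 12);
}
```

A NULL check cannot detect a pointer that holds a stale or garbage address, which is why initialization matters.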
Pointers and arrays
Use of pointer notation to manipulate arrays…
char msg[] = "Hello!";
char *str = &msg[0];
OR:
char *str = msg;
[Figure: msg holds 'H' 'e' 'l' 'l' 'o' '!' '\0'.]
Examples
msg[0], str[0] and *str all refer to 'H'.
msg[1], str[1] and *(str+1) all refer to 'e'.
Examples
Pointer notation leads to some (intimidating?) shortcuts as part of the C idiom.
Moving through a string:
while (*str != '\0')
    str++;
Examples
The previous example can exploit the fact that C treats 0 as false, and '\0' has the value 0:
while (*str)
    str++;
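The loop above is the core of a string-length routine. A minimal sketch follows; `count_chars` is a hypothetical name, and it does the same job as the standard `strlen`.

```c
#include <assert.h>
#include <stddef.h>

/* Walk the string with a pointer until the terminating '\0' is reached. */
size_t count_chars(const char *str)
{
    const char *start = str;

    while (*str) {        /* '\0' has the value 0, which C treats as false */
        str++;
    }
    return (size_t)(str - start);   /* pointer difference = number of chars */
}

void count_chars_demo(void)
{
    char msg[] = "Hello!";
    char *str = msg;

    assert(*str == 'H');
    assert(*(str + 1) == msg[1]);   /* pointer and array notation agree */
    assert(count_chars(msg) == 6);  /* the '\0' is not counted */
    assert(count_chars("") == 0);
}
```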
Why use pointers?
› Some mathematical operations are more convenient using pointers
– e.g., array operations
› However, we have only looked at static data. Pointers are essential in dealing with dynamic data structures.
› Imagine you are writing a text editor.
– You could estimate the largest line-length and create arrays of that size (problematic).
– Or you could dynamically allocate memory as needed, using pointers.
Revision Pointers
› What is the value held by p? And how much memory is used by p (in bytes)?
› int p;
› char p;
› void foo( int *p )
› char *p;
› char **p;
› int **p;
› long *p;
› void *p;
› const unsigned long long int * const p;
› bubblebobble ************p;
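The memory used by each declaration can be checked with sizeof. The sketch below prints the sizes on whatever machine compiles it; the numbers depend on the platform, so no particular output is assumed. The value held by an uninitialized local p is indeterminate until something is assigned to it. `pointer_size_demo` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

void pointer_size_demo(void)
{
    /* Sizes of plain values depend on the type. */
    printf("int     : %zu\n", sizeof(int));
    printf("char    : %zu\n", sizeof(char));

    /* A pointer stores an address, whatever it points to. */
    printf("char *  : %zu\n", sizeof(char *));
    printf("char ** : %zu\n", sizeof(char **));
    printf("int **  : %zu\n", sizeof(int **));
    printf("long *  : %zu\n", sizeof(long *));
    printf("void *  : %zu\n", sizeof(void *));
    printf("const unsigned long long int * const : %zu\n",
           sizeof(const unsigned long long int *const));

    assert(sizeof(char) == 1);                 /* guaranteed by the language */
    assert(sizeof(char *) == sizeof(void *));  /* same representation */
}
```

On mainstream platforms all object pointers share one size (often 4 bytes on 32-bit and 8 bytes on 64-bit systems), although the C standard does not require this for every pointer type.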
Pointer interpretation
› char *p
– Address of a single char value
– Address of a single char value that is the first in an array
› char *argv[]
– Array of “the type” with unknown length
– Type is char *
› char **argv
– * Address of the first element of an array of type char *
– Then, each element in * is an…
– * address of the first element of an array of type char
char argv[][3]; // oh no!
Pointer interpretation
› Interpretations of int **data;
1. Pointer to pointer to single int value
2. Array of addresses that point to a single int
3. Address that points to one array of int values
4. Array of addresses that point to arrays of int values
› Thinking about each * as an array:
1. Array size ==1, Array size ==1
2. Array size >=1, Array size == 1
3. Array size ==1, Array size >= 1
4. Array size >=1, Array size >= 1
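The four interpretations differ only in how many elements sit behind each level of indirection, so one function can serve all of them if the caller supplies both sizes. This is a sketch; `sum_all` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Treat data as `outer` pointers, each pointing at `inner` ints. */
long sum_all(int **data, size_t outer, size_t inner)
{
    long total = 0;

    for (size_t i = 0; i < outer; i++) {
        for (size_t j = 0; j < inner; j++) {
            total += data[i][j];
        }
    }
    return total;
}

void int_pp_demo(void)
{
    int single = 7;
    int *one = &single;
    int row0[3] = {1, 2, 3};
    int row1[3] = {4, 5, 6};
    int *rows[2] = {row0, row1};
    int *only = row0;

    assert(sum_all(&one, 1, 1) == 7);    /* 1: pointer to pointer to a single int */
    assert(sum_all(rows, 2, 1) == 5);    /* 2: two addresses, one int each (1 + 4) */
    assert(sum_all(&only, 1, 3) == 6);   /* 3: one address of an array of 3 ints */
    assert(sum_all(rows, 2, 3) == 21);   /* 4: two addresses of arrays of 3 ints */
}
```

The type int ** alone does not say which case applies, which is why the sizes have to travel alongside the pointer.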
Pointer arithmetic
› int *p = NULL;
› int x[4];
› p = x;
x = 0x0100
[Figure: x occupies consecutive byte addresses starting at 0x0100; *(p + 0), *(p + 1), *(p + 2) and *(p + 3) label its four elements, which hold 0, 0, 0 and 17.]
› Seeking to the nth byte from a starting address?
void *get_address( sometype *data , int n) {
    unsigned char *ptr = (unsigned char*)data;
    return (void*)(ptr + n);
}
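The difference between stepping by elements and stepping by bytes can be checked directly. The sketch below uses a hypothetical `byte_offset` helper in the spirit of get_address, written against void * so it accepts any object pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Return the address n bytes after data. */
void *byte_offset(void *data, size_t n)
{
    unsigned char *ptr = data;   /* unsigned char steps one byte at a time */
    return ptr + n;
}

void pointer_arithmetic_demo(void)
{
    int x[4] = {0, 0, 0, 17};
    int *p = x;

    assert(*(p + 3) == 17);      /* p + 3 skips three ints, not three bytes */
    assert((p + 3) - p == 3);    /* distance measured in elements */

    /* The same distance measured in bytes is 3 * sizeof(int). */
    assert((unsigned char *)(p + 3) - (unsigned char *)p
           == (ptrdiff_t)(3 * sizeof(int)));

    assert(byte_offset(x, 3 * sizeof(int)) == (void *)&x[3]);
}
```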
Q1
Assume that an int variable takes 4 bytes and a char variable takes 1 byte.
(A) Number of elements between two pointers: 5. Number of bytes between two pointers: 20
(B) Number of elements between two pointers: 20. Number of bytes between two pointers: 20
(C) Number of elements between two pointers: 5. Number of bytes between two pointers: 5
(D) Compiler Error
(E) Runtime Error
Q2
(A) Compiler Error
(B) Garbage Value
(C) Runtime Error
(D) G
Q3
(A) 10 20 30 40
(B) Machine Dependent
(C) 10 20
(D) Nothing
int arr[4] = {10, 20, 30, 40};
printf("%zu\n", sizeof(arr));
What does this print? Assume it is a 64-bit machine.
16
10 20
Q4
Assuming 32-bit machine:
(A) sizeof arri[] = 3 sizeof ptri = 4 sizeof arrc[] = 3 sizeof ptrc = 4
(B) sizeof arri[] = 12 sizeof ptri = 4 sizeof arrc[] = 3 sizeof ptrc = 1
(C) sizeof arri[] = 3 sizeof ptri = 4 sizeof arrc[] = 3 sizeof ptrc = 1
(D) sizeof arri[] = 12 sizeof ptri = 4 sizeof arrc[] = 3 sizeof ptrc = 4
(A) 2 2 (B) 2 1 (C) 0 1 (D) 0 2
Q5
Q6
S1: will generate a compilation error
S2: may generate a segmentation fault at runtime depending on the arguments passed
S3: correctly implements the swap procedure for all input pointers referring to integers stored in memory locations accessible to the process
S4: implements the swap procedure correctly for some but not all valid input pointers
S5: may add or subtract integers and pointers.
(A) S1
(B) S2 and S3
(C) S2 and S4
(D) S2 and S5
(A) 18 (B) 19 (C) 21 (D) 22
x + y + z = 7 + 7 + 5 = 19
Q7
Q8
(A) 12
(B) Compiler Error
(C) Run Time Error
(D) 0
printf("%d", *(int *)ptr);
1. A void pointer cannot be dereferenced directly. It can be dereferenced after casting it to a pointer to a concrete type.
2. Standard C does not allow pointer arithmetic on a void pointer, because void has no size and the compiler cannot tell how far to step.
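Both rules can be illustrated in a short sketch: cast before dereferencing, and cast to a byte pointer before doing arithmetic. `read_int` and `advance` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Dereference a void pointer by first casting it to the real type. */
int read_int(const void *ptr)
{
    return *(const int *)ptr;
}

/* Step a void pointer forward by count elements of elem_size bytes each. */
const void *advance(const void *ptr, size_t elem_size, size_t count)
{
    return (const unsigned char *)ptr + elem_size * count;
}

void void_pointer_demo(void)
{
    int values[3] = {12, 34, 56};
    const void *ptr = values;    /* any object pointer converts to void * */

    assert(read_int(ptr) == 12);
    assert(read_int(advance(ptr, sizeof(int), 2)) == 56);
}
```

The caller supplies the element size because a void * carries no type information of its own.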
Q9
(A) (geeksquiz, geeksforgeeks) (geeksforgeeks, geeksquiz)
(B) (geeksforgeeks, geeksquiz) (geeksquiz, geeksforgeeks)
(C) (geeksquiz, geeksforgeeks) (geeksquiz, geeksforgeeks)
(D) (geeksforgeeks, geeksquiz) (geeksforgeeks, geeksquiz)
(A) ab (B) cd (C) ef (D) gh
Q10
Memory
• Memory is a long array of 8-bit pieces called bytes
• This array is indexed from 0 to one less than the number of bytes in the memory
• Each index is a memory address: 0, 1, 2, 3, …
Memory Areas
• Stack: local variables, function arguments, return addresses, temporary storage
• Heap: dynamically allocated memory
• Global/static: global variables, static variables
• Code: program instructions
Memory Layout
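[Figure: memory layout from address 0 at the bottom to the end of memory at the top: code, static/global data, heap, stack; the unused region is labeled free space.]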
The Stack
• In C, all variables local to a function and function arguments are stored on the stack
• To call a function the code does:
push arguments onto stack
push return address onto stack
jump to function code
The Stack
• Inside the function, the code does the following:
increment the stack pointer to allow space for the local variables
execute the code
pop local variables and arguments off the stack
push the return result onto the stack
jump to return address
Function call example
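…
res = myfun(123);
…
[Figure: the stack just before the jump to myfun: the argument 123 and the return address 1b345f0 have been pushed, and the stack pointer points just past them.]
Function call example
int myfun(int a) {
    int b = 5;
    …
    return 0;
}
[Figure: inside myfun the stack holds 123, 1b345f0 and the local variable b = 5.]
Function call example
[Figure: after myfun returns, the local variables and arguments have been popped and the stack holds the return value 0.]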
Heap
Memory may be dynamically allocated at run-time from an area known as “the heap”.
Unlike the stack, which meets the temporary storage demands associated with called functions, the heap is accessed under direct programmer control.
We request an allocation of memory from the heap.
If there is sufficient contiguous memory available, we are given the address of the start of the allocated memory.
[Figure: the heap divided into used and free regions; the returned pointer marks the start of the newly allocated block.]
Q: Where are parts of this program stored?
int a;
int main() {
    int b;
    int *p;
    p = malloc(…);
}
int doit(int c) {
    static int d;
}
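One way to answer the question is to annotate each declaration with the memory area it lives in. The sketch below restates the program with comments; `make_block` is a hypothetical stand-in for the body of main, and the block size of 4 ints is an assumption made only so the code compiles.

```c
#include <assert.h>
#include <stdlib.h>

int a;                            /* global variable: static/global area */

int doit(int c)                   /* c: function argument, on the stack */
{
    static int d;                 /* static local: static/global area */
    d += c;                       /* d keeps its value between calls */
    return d;
}

int *make_block(void)
{
    int b = 4;                    /* local variable: on the stack */
    int *p;                       /* the pointer itself: on the stack */
    p = malloc(b * sizeof *p);    /* the block p points to: on the heap */
    return p;                     /* the caller must free() it */
}

void storage_demo(void)
{
    assert(a == 0);               /* globals start out zeroed */
    assert(doit(2) == 2);         /* d started at 0 */
    assert(doit(3) == 5);         /* and survived the first call */

    int *p = make_block();
    if (p != NULL) {
        p[0] = 1;                 /* heap memory outlives make_block's stack frame */
        free(p);
    }
}
```

The instructions of doit, make_block and storage_demo themselves live in the code area.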
SUMMARY
Memory allocation is not difficult!
It only causes problems because novice programmers may not recognise the need to address it…
Java programmers are less likely to experience such problems simply because Java hides the need to deal with this whole issue.
Memory Management Functions
Memory allocation functions
Memory allocation functions return a “pointer to void”.
A pointer to void is a “generic” pointer type. A void * can be converted to any other object pointer type.
In C this conversion happens implicitly on assignment, so a cast to the specific type is optional (C++ requires the cast).
Memory allocation functions: malloc
#include <stdlib.h>
void *malloc(size_t size);
size_t is an unsigned integer type chosen by the implementation, for example:
typedef unsigned int size_t;
Requests size number of bytes of memory.
Returns a pointer to the allocated memory if successful, or a NULL pointer if unsuccessful.
A comment on the use of size_t:
Use of size_t replaces the use of more specific types, such as int, short, etc. This allows the actual implementation to be system-specific.
The result of the sizeof operator has type size_t. sizeof is often used to specify memory requirements, so it makes sense for the size argument of the memory allocation functions to have type size_t.
int * ptr;
ptr = (int *)malloc(sizeof(int)*20);
If an int is 4 bytes, then this call will request 80 bytes of memory from the heap.
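A slightly more defensive version of the call above checks the size computation and the result. This is a sketch; `alloc_ints` is a hypothetical wrapper, not a standard function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for count ints. Returns NULL if count is 0, if the byte
   count would overflow size_t, or if malloc fails. */
int *alloc_ints(size_t count)
{
    if (count == 0 || count > SIZE_MAX / sizeof(int)) {
        return NULL;
    }
    return malloc(count * sizeof(int));
}

void malloc_demo(void)
{
    assert(alloc_ints(SIZE_MAX) == NULL);   /* request too large to express */

    int *ptr = alloc_ints(20);
    if (ptr == NULL) {
        return;                             /* allocation failed: nothing to use */
    }
    for (size_t i = 0; i < 20; i++) {
        ptr[i] = (int)i;                    /* malloc memory starts uninitialized */
    }
    assert(ptr[19] == 19);
    free(ptr);
}
```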
calloc
#include <stdlib.h>
void *calloc(size_t num, size_t size);
This is similar to malloc except that:
• It has two arguments:
– num specifies the number of “blocks” of contiguous memory
– size specifies the size of each block
• The allocated memory is cleared (set to ‘0’).
Can be implemented with malloc and memset.
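The malloc-plus-memset idea can be sketched as follows. `zeroed_alloc` is a hypothetical name; a real calloc may be implemented differently, and the overflow check is an addition that the one-line description leaves out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate num blocks of size bytes each and clear them to zero. */
void *zeroed_alloc(size_t num, size_t size)
{
    if (size != 0 && num > SIZE_MAX / size) {
        return NULL;                      /* num * size would overflow */
    }

    size_t total = num * size;
    void *block = malloc(total);
    if (block != NULL) {
        memset(block, 0, total);          /* the step malloc alone does not do */
    }
    return block;
}

void zeroed_alloc_demo(void)
{
    int *values = zeroed_alloc(8, sizeof(int));
    if (values == NULL) {
        return;
    }
    for (size_t i = 0; i < 8; i++) {
        assert(values[i] == 0);
    }
    free(values);
}
```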
free
#include <stdlib.h>
void free(void *ptr);
This is used to de-allocate memory previously allocated by any of the memory allocation functions.
int * ptr;
ptr = (int *)malloc(sizeof(int)*20);
free((void *)ptr);
ptr = NULL;
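Freeing and then clearing the pointer can be wrapped in one helper so the two steps are never separated. `release` is a hypothetical name for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the block *pp points to and reset *pp to NULL. */
void release(int **pp)
{
    if (pp == NULL) {
        return;
    }
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;     /* no dangling pointer is left behind */
}

void free_demo(void)
{
    int *ptr = malloc(sizeof(int) * 20);

    release(&ptr);
    assert(ptr == NULL);

    release(&ptr);  /* a second call is harmless because ptr is NULL */
    assert(ptr == NULL);
}
```

Setting the pointer to NULL only protects this one variable; any other copy of the old address is still dangling.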
Memory Leak
A big issue for long-running data center applications, scientific simulations, daemons (background processes) and servers.
realloc
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
This takes previously-allocated memory and attempts to resize it.
This may require a new block of memory to be found, so it returns a new void pointer to memory.
Contents are preserved, up to the smaller of the old and new sizes.
int * ptr;
ptr = (int *)malloc(sizeof(int)*2);
ptr = (int *)realloc(ptr, sizeof(int)*200);
Older code sometimes uses realloc(ptr, 0) to free memory. That behavior is implementation-defined in earlier C standards and undefined in C23, so call free() instead.
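If realloc fails it returns NULL and leaves the original block allocated, so assigning the result straight back to the only pointer would lose that block. A common pattern, sketched here with a hypothetical `resize_ints` helper, keeps the result in a temporary first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Resize *buf to hold new_count ints. Returns 1 on success; on failure
   returns 0 and leaves *buf untouched and still valid. */
int resize_ints(int **buf, size_t new_count)
{
    if (buf == NULL || new_count == 0 || new_count > SIZE_MAX / sizeof(int)) {
        return 0;
    }

    int *tmp = realloc(*buf, new_count * sizeof(int));
    if (tmp == NULL) {
        return 0;            /* old block is still owned by *buf */
    }
    *buf = tmp;              /* the block may have moved */
    return 1;
}

void realloc_demo(void)
{
    int *ptr = malloc(sizeof(int) * 2);
    if (ptr == NULL) {
        return;
    }
    ptr[0] = 10;
    ptr[1] = 20;

    if (resize_ints(&ptr, 200)) {
        assert(ptr[0] == 10 && ptr[1] == 20);   /* contents are preserved */
        ptr[199] = 30;                          /* the new space is usable */
    }
    free(ptr);
}
```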
Example
Correct?
Example
Correct?
Dynamically creating structures
struct thing * ptr;
ptr = (struct thing *)malloc(sizeof(struct thing));
/* Do stuff */
ptr->day = mon;
…
free((void *)ptr);
ptr = NULL;
This is some of what Java does “behind the scenes” on object creation.
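The allocate, initialize and free steps are often grouped into a pair of functions, loosely resembling a constructor and a destructor. The slides do not show the definition of struct thing, so the layout, the `weekday` enum and the function names below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

enum weekday { mon, tue, wed };     /* hypothetical: only `mon` appears in the slides */

struct thing {                      /* hypothetical layout */
    enum weekday day;
    int count;
};

/* Allocate and initialize a struct thing; returns NULL on failure. */
struct thing *thing_create(enum weekday day)
{
    struct thing *ptr = malloc(sizeof *ptr);
    if (ptr == NULL) {
        return NULL;
    }
    ptr->day = day;
    ptr->count = 0;
    return ptr;
}

/* Free the struct and clear the caller's pointer. */
void thing_destroy(struct thing **pp)
{
    if (pp == NULL) {
        return;
    }
    free(*pp);
    *pp = NULL;
}

void thing_demo(void)
{
    struct thing *ptr = thing_create(mon);
    if (ptr == NULL) {
        return;
    }
    assert(ptr->day == mon);
    ptr->day = tue;                 /* do stuff */
    assert(ptr->day == tue);

    thing_destroy(&ptr);
    assert(ptr == NULL);
}
```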
Safety issues
Caution #1:
De-allocate memory that is no longer required.
While the system should de-allocate resources on termination, it is good practice to take control of this process.
In some Java programs there is a noticeable performance dip when the automatic “garbage collection” functionality kicks in.
Safety issues
Caution #2:
NEVER attempt to de-allocate memory that has not been allocated!
A common error is to try to free memory that has already been de-allocated, or was never allocated in the first instance.
Safety issues
Caution #3:
NEVER try to use memory that has been de-allocated.
This is also a common error leading to serious problems.
int* p = malloc(sizeof(int));
free(p);
*p = 5; /* use after free: undefined behavior, may segfault */
Safety issues
Caution #4:
Know your memory allocation requirements!
Use of the sizeof operator addresses the more obvious problems.
However, a common problem is to forget that a string includes a ‘\0’ terminating character.
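The terminating character is easy to account for once it is written down explicitly. In this sketch `copy_string` is a hypothetical helper that duplicates a string on the heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of src, or NULL if allocation fails. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);        /* strlen does not count the '\0' */
    char *copy = malloc(len + 1);    /* so reserve one extra byte for it */

    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len + 1);      /* copies the characters and the '\0' */
    return copy;
}

void copy_string_demo(void)
{
    char *copy = copy_string("Hello!");
    if (copy == NULL) {
        return;
    }
    assert(strlen(copy) == 6);       /* 6 characters, 7 bytes allocated */
    assert(strcmp(copy, "Hello!") == 0);
    free(copy);
}
```

Allocating only strlen(src) bytes would make the copy write one byte past the end of the block.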
Safety issues
Caution #5:
Check for success!
A failed memory allocation request can lead to disaster if it is simply assumed to be successful.
Previous examples here have made this assumption for convenience. This would NOT qualify as bullet-proof code!
Summary
✓Understand the need for memory allocation and de-allocation
✓Be able to use relevant C functions for achieving this
✓ malloc ✓ calloc ✓ realloc ✓ free
✓Be able to allocate and access memory safely
Write a 2d Array
Write a 2d array via single pointer
Using an array of pointers
Using pointer to a pointer
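The three approaches named above can be compared in one sketch. The sizes ROWS and COLS and the function names are assumptions made for illustration; each approach allocates a zeroed 3 x 4 grid of ints and frees it again.

```c
#include <assert.h>
#include <stdlib.h>

/* Pointer to a pointer: an array of row pointers, each row allocated separately. */
int **grid_create(size_t rows, size_t cols)
{
    int **grid = malloc(rows * sizeof *grid);
    if (grid == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < rows; r++) {
        grid[r] = calloc(cols, sizeof **grid);
        if (grid[r] == NULL) {           /* undo the rows allocated so far */
            while (r > 0) {
                free(grid[--r]);
            }
            free(grid);
            return NULL;
        }
    }
    return grid;
}

void grid_destroy(int **grid, size_t rows)
{
    if (grid == NULL) {
        return;
    }
    for (size_t r = 0; r < rows; r++) {
        free(grid[r]);
    }
    free(grid);
}

void grid_demo(void)
{
    enum { ROWS = 3, COLS = 4 };

    /* 1. Single pointer: one block, element (r, c) lives at r * COLS + c. */
    int *flat = calloc((size_t)ROWS * COLS, sizeof *flat);
    if (flat != NULL) {
        flat[1 * COLS + 2] = 7;
        assert(*(flat + 1 * COLS + 2) == 7);
        free(flat);
    }

    /* 2. Array of pointers: the row table is on the stack, the rows on the heap. */
    int *rows[ROWS];
    int ok = 1;
    for (int r = 0; r < ROWS; r++) {
        rows[r] = calloc(COLS, sizeof *rows[r]);
        if (rows[r] == NULL) {
            ok = 0;
        }
    }
    if (ok) {
        rows[2][3] = 9;
        assert(rows[2][3] == 9);
    }
    for (int r = 0; r < ROWS; r++) {
        free(rows[r]);
    }

    /* 3. Pointer to a pointer: both the row table and the rows are on the heap. */
    int **grid = grid_create(ROWS, COLS);
    if (grid != NULL) {
        grid[0][1] = 5;
        assert(grid[0][1] == 5 && grid[2][3] == 0);
        grid_destroy(grid, ROWS);
    }
}
```

The single-pointer version needs one allocation and one free, while the other two allow the familiar grid[r][c] notation at the cost of one allocation per row.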
Q1
Q2
Which of the following is true?
(A) “ptr = calloc(m, n)” is equivalent to the following:
ptr = malloc(m * n);
(B) “ptr = calloc(m, n)” is equivalent to the following:
ptr = malloc(m * n);
memset(ptr, 0, m * n);
(C) “ptr = calloc(m, n)” is equivalent to the following:
ptr = malloc(m);
memset(ptr, 0, m);
(D) “ptr = calloc(m, n)” is equivalent to the following:
ptr = malloc(n);
memset(ptr, 0, n);
Q3
(A) Compiler Error: free can’t be applied on NULL pointer
(B) Memory Leak
(C) Dangling Pointer
(D) The program may crash as free() is called for NULL pointer.
Q4
Which of the following is/are true
(A) calloc() allocates the memory and also initializes the allocated memory to zero, while memory allocated using malloc() has uninitialized data.
(B) malloc() and memset() can be used to get the same effect as calloc().
(C) calloc() takes two arguments, but malloc takes only 1 argument.
(D) Both malloc() and calloc() return ‘void *’ pointer.
(E) All of the above