Pointers in C – Part 2
CSE 2421
Arrays and pointers
Statically allocated arrays
Dynamically allocated arrays
Pointers to void (void *)
Dynamic memory allocation and pointers
Freeing (deallocating) dynamically allocated storage
Pointer arithmetic
Function parameters and pointers
C Pointers – Part 2 – Overview
Arrays in C differ from arrays in Java in a number of respects.
Arrays are closely related to pointers in C; elements of arrays can be accessed using a pointer in C as well as by index.
But when arrays get complex, they are almost always accessed via pointers.
C has two different types of arrays:
Statically allocated arrays: The compiler determines the size of the array at compile time and generates code to reserve space for the array elements (and to initialize them, if they are initialized as part of the declaration). The space is in the static data area or on the stack, depending upon the declaration statement.
File or block scope/static or automatic class
Dynamically allocated arrays: A C library function is called to request space for the array elements at runtime (that is, after the program begins running).
We need to declare a pointer to the element type (for example, int * for an array of ints) to hold the address of this block of memory.
Arrays in C
Example:
int scores[6];
C arrays declared as above are allocated storage at compile time and have a fixed size.
The expression enclosed in [ ] must be a constant, the value of which is known at compilation time.
The [ ] cannot be empty
Whether the array elements are of static storage class or automatic storage class, the compiler will generate code to allocate memory storage for the elements of the array and, additionally, to initialize them to zero if they are of static storage class. This is why these arrays are referred to as static arrays, or statically allocated arrays (Here, static means at compile time).
Declaration of Static Arrays
Example:
int scores[6] = {19, 17, 18, 16, 15, 20};
int scores[] = {19, 17, 18, 16, 15, 20};
C arrays declared as above are also allocated storage at compile time and have a fixed size.
The expression enclosed in [ ] must be a constant, the value of which is known at compilation time.
Because we are specifying initial values for array elements, if [ ] is empty, then the compiler will create an array with a number of elements equal to the number of values enclosed in braces.
Thus, the two declarations above declare the same array.
Whether the array elements are of static storage class or automatic storage class, the compiler will generate code to allocate memory storage for the elements of the array, and to store their initial values in the static data area or on the stack. This is why these arrays are referred to as static arrays, or statically allocated arrays (Here, static means at compile time).
Declaration of Static Arrays
Arrays of static storage class are initialized to 0 (all elements) if no explicit initialization is given for any element; the C standard guarantees this. Arrays of automatic storage class with no initializer contain garbage values.
If you provide fewer values than the number of elements in the array, the remaining values will be initialized to 0 (this works for both static and automatic storage class arrays):
int scores[10] = {19, 20}; /* last 8 elements set to 0 */
To explicitly initialize all elements to 0 (for a static storage class or automatic storage class array):
int scores[10] = {0};
Consider:
int scores[10] = {19, 17, 18, 16, 15, 20};
int scores[] = {19, 17, 18, 16, 15, 20};
These two declarations do not declare the same array. Why?
Array Initialization
There is no library function in C that will tell you the size of an array (for statically or dynamically allocated arrays).
This is because no array termination marker is stored in the array (except for strings or char arrays – more later).
Therefore, you must keep track of the size, and check indexes “manually” (in the code you write) to ensure that they are within range.
If you try to access elements beyond the last element, the behavior is undefined: the program may stop with a run-time error (typically a segmentation fault) OR it may silently read or write a value that is not an element of the array (a harder bug to find).
Array Size
Arrays cannot be copied with the assignment operator in C. (The library function memcpy() in string.h copies a block of bytes, and strings have their own copy functions, which we’ll see soon.)
To copy an array element by element, use a for or while loop. For example, we can use something such as:
int scores[6] = {19, 17, 18, 16, 15, 20}; /* Array to be copied */
int copy[6]; /* Copy of original array */
int i; /* Loop index */
for (i = 0; i < 6; i++) {
    copy[i] = scores[i];
}
or:
int scores[6] = {19, 17, 18, 16, 15, 20}; /* Array to be copied */
int copy[6]; /* Copy of original array */
int *scr_ptr, *copy_ptr;
int i; /* Loop index */
scr_ptr = scores; /* note these two lines */
copy_ptr = copy;
for (i = 0; i < 6; i++) {
    *copy_ptr++ = *scr_ptr++; /* copy one element, then advance both pointers */
}
Copying Arrays
When an array name is used by itself, it represents the address of the array, which is also the address of the first element of the array.
Arrays and Pointers
int scores[6] = {19, 17, 18, 16, 15, 20};
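In C, the name of the array by itself acts as a constant pointer to the first element of the array; that is, scores is the same as &scores[0]. (&scores has the same address value but a different type: it is a pointer to the whole array of 6 ints, so &scores + 1 skips past the entire array.)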
Because this pointer is a constant, it cannot be changed (for example, you cannot assign a different address to it).
You can say:
int scores[6] = {19, 17, 18, 16, 15, 20};
int *scores_ptr;
scores_ptr = scores;
You can’t say:
int scores[6] = {19, 17, 18, 16, 15, 20};
int scores2[6] = {21, 16, 12, 13, 5, 7};
int *scores_ptr;
scores_ptr = scores2; /* OK: a pointer variable can be reassigned */
scores = scores_ptr; /* ERROR: an array name cannot be assigned to */
All of the elements of the array are stored in contiguous memory locations.
Arrays and Pointers
As we work with pointers, it becomes important to know the common data type sizes.
The C standard does not guarantee these sizes, but they are typical of 64-bit Linux systems.
Common Data Type Sizes
Data Type Size in Bytes
byte 1
char 1
short 2
int 4
long 8
float 4
double 8
All of the elements of an array are stored in contiguous memory locations.
int scores[6] = {19, 17, 18, 16, 15, 20};
Arrays and Pointers
Array element   Address    Value
scores[0]       0x600020   19
scores[1]       0x600024   17
scores[2]       0x600028   18
scores[3]       0x60002C   16
scores[4]       0x600030   15
scores[5]       0x600034   20
When we have pointers, there are several arithmetic operations that we can perform*:
Adding an integer to a pointer
Subtracting an integer from a pointer
Subtracting two pointers from each other
Comparing two pointers
*Arithmetic is not permitted on pointers to functions; they can only be compared for equality.
Pointer Arithmetic
When we index an array, the compiler generates instructions to compute the exact address of the chosen element; it can do this because it knows all the bytes used to store the array are contiguous.
For example, to access scores[3], the compiler generates instructions to compute the byte address as (this is address arithmetic, not a C expression):
(address of scores) + (3 * sizeof(int))
Recall that scores is a constant pointer to the 1st element, so the compiler can compute the address above as:
(address of scores[0]) + (3 * sizeof(int))
If we wanted to access the value of scores[3] we would code it as:
*(scores+3)
the compiler multiplies 3 by sizeof(int) for us
Pointer Arithmetic
Any time that we add an integer to, or subtract an integer from, a pointer value in C, the compiler will multiply the integer by the size of the data type to which the pointer points, then add or subtract that value from the address
So, scores+3 would calculate 0x600020+(3*sizeof(int))
= 0x600020+12
= 0x60002C
IMPORTANT: When you use pointer arithmetic, do not scale the integer yourself, because the compiler will do it. If you multiply by the size of the type, you will get double scaling, which accesses the wrong element and usually lands outside the array.
Pointer Arithmetic: +/- constant
Array element   Address    Value
scores[0]       0x600020   19
scores[1]       0x600024   17
scores[2]       0x600028   18
scores[3]       0x60002C   16
scores[4]       0x600030   15
scores[5]       0x600034   20
When we subtract two pointers from each other in C,
The data types of the two pointers must be the same
The compiler will perform the subtraction, then take into account the size of the data type for which the two pointers are declared and divide the result by the size so that it reports the number of “units” by which the two addresses differ. The result is only meaningful when both pointers point into the same array.
Using the scores array, if
int *p0 = scores; /* p0 = 0x600020 */
int *p1 = scores+1; /* p1 = 0x600024 */
int *p2 = scores+2; /* p2 = 0x600028 */
int *p5 = scores+5; /* p5 = 0x600034 */
/* A pointer difference has type ptrdiff_t, which printf prints with %td */
printf("p2-p0: %td\n", p2 - p0); /* p2-p0: 2 */
printf("p2-p1: %td\n", p2 - p1); /* p2-p1: 1 */
printf("p0-p1: %td\n", p0 - p1); /* p0-p1: -1 */
printf("p5-p2: %td\n", p5 - p2); /* p5-p2: 3 */
Note that in the first printf statement, the difference between the positions of the two array elements is 2. That is, their indexes differ by 2.
((0x600028 - 0x600020)/4)=2
In the 3rd printf statement, the difference is -1, indicating that the element pointed to by p0 immediately precedes the element pointed to by p1.
((0x600020 - 0x600024)/4)=-1
Pointer Arithmetic: subtract 2 pointers
Pointers can be compared using the standard comparison operators.
Normally, comparing pointers does not produce much useful information, although it can be used to determine the relative ordering of elements of the same array.
Using the scores array again, with 0 for false and 1 for true, if
int *p0 = scores; /* p0 = 0x600020 */
int *p1 = scores+1; /* p1 = 0x600024 */
int *p2 = scores+2; /* p2 = 0x600028 */
int *p5 = scores+5; /* p5 = 0x600034 */
printf("p2>p0: %d\n", p2 > p0); /* p2>p0: 1 */
printf (“p2
printf (“p5
void *calloc(size_t nmemb, size_t size);
void *malloc(size_t size);
void free(void *ptr);
void *realloc(void *ptr, size_t size);
DESCRIPTION
calloc() allocates memory for an array of nmemb elements of size bytes each and returns a
pointer to the allocated memory. The memory is set to zero. If nmemb or size is 0, then
calloc() returns either NULL, or a unique pointer value that can later be successfully
passed to free().
malloc() allocates size bytes and returns a pointer to the allocated memory. The memory
is not cleared. If size is 0, then malloc() returns either NULL, or a unique pointer
value that can later be successfully passed to free().
free() frees the memory space pointed to by ptr, which must have been returned by a
previous call to malloc(), calloc() or realloc(). Otherwise, or if free(ptr) has already
been called before, undefined behavior occurs. If ptr is NULL, no operation is performed.
Malloc(3) Manual Page
malloc() returns a pointer to void (i.e., void *), which holds the address of the first byte of the allocated memory space on the heap.
This function takes one parameter – which may be an expression – that specifies the number of bytes being requested (Use sizeof() for portability!).
For example, to request enough bytes for 4 integer values, we could use:
int *p;
p = malloc ( 4 * sizeof(int) ); /* malloc (4 * 4) is not portable! */
If we only needed the number of bytes for one int, we could use:
int *p;
p = malloc ( sizeof(int) );
The memory returned by malloc is uninitialized *(contains garbage values), so be sure to initialize it before you use it!
*Does anyone see a security problem here???
malloc()
calloc() returns a pointer to void (i.e., void *), which holds the address of the first byte of the allocated memory space on the heap.
This function takes two parameters – which may be expressions – that specify the number of elements for which storage is being requested and the size of each element in bytes (Use sizeof() for portability!).
So, to request memory space for 4 integer values, we could use:
int *p;
p = calloc ( 4, sizeof(int) ); /* calloc (4, 4) is not portable! */
If we only needed space for one int, we could use:
int *p;
p = calloc ( 1, sizeof(int) );
The memory returned by calloc is initialized to 0, so if you do not plan to initialize the values before using them, use calloc, and not malloc!
This means that calloc() will take more CPU time to execute than will malloc(). It may or may not be an issue, but worth thinking about depending upon how much space you are requesting.
calloc()
If the requested memory cannot be allocated, both malloc() and calloc() return the null pointer (NULL, defined in stdlib.h), which compares equal to 0.
Therefore, before using the pointer to access any of the allocated memory, you should check to make sure that the pointer returned was not null. For example:
int *p;
p = calloc (10, sizeof(int) );
if (p != 0) { /* Also: if (p != NULL); NULL is #defined in stdlib.h */
. . . .
. . . . /* OK to access values in allocated storage */
}
else { . . . . /* Some code to handle the allocation failure*/
}
What if allocation fails?
If your program uses storage which has been allocated dynamically, then you should free it (return it to the operating system) once it is no longer being used.
The C library function free() is used for this; it returns void, and has a single parameter, which is a pointer to the first byte of the allocated storage to be freed, and this pointer MUST be pointing to the 1st byte of some dynamically allocated storage!
free() is also declared in stdlib.h.
Freeing allocated storage
Here’s an example of how to free dynamically allocated storage:
int *p;
p = calloc (10, sizeof (int) );
……
if (p != NULL) free (p); /* releases storage to which p points
back to the OS */
p = NULL;
The pointer which was passed to free should also be set to NULL or 0 after the call to free(), to ensure that you do not attempt to access it inadvertently. To do so can cause a segmentation fault.
Freeing allocated storage
Run your program with a utility called valgrind
If we ran valgrind on lab2:
[jones.5684@cse-fac2 lab2]$ valgrind bit_encode2
==30211== Memcheck, a memory error detector
==30211== Copyright (C) 2002-2017, and GNU GPL’d, by Julian Seward et al.
==30211== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==30211== Command: bit_encode2
==30211==
enter cleartext: two fat dogs
two fat dogs
hex encoding:
74 77 6F 20 66 61 74 20 64 6F
67 73
enter 4-bit key: 0110
hex ciphertext:
12 11 09 46 00 07 12 46 02 09
01 15
==30211==
==30211== HEAP SUMMARY:
==30211== in use at exit: 0 bytes in 0 blocks
==30211== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==30211==
==30211== All heap blocks were freed -- no leaks are possible
==30211==
==30211== For counts of detected and suppressed errors, rerun with: -v
==30211== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[jones.5684@cse-fac2 lab2]$
How to check if you freed everything
The improper use of dynamic memory allocation can frequently be a source of bugs. These can include security bugs or program crashes, most often due to segmentation faults. The most common errors are as follows:
Not checking for allocation failures
Memory allocation is not guaranteed to succeed, and may instead return a null pointer. Using the returned value, without checking if the allocation is successful, invokes undefined behavior. This usually leads to a crash (due to the resulting segmentation fault on the null pointer dereference), but there is no guarantee that a crash will happen so relying on that can also lead to problems.
Memory leaks
Failure to deallocate memory using free() leads to buildup of non-reusable memory, which is no longer used by the program. This wastes memory resources and can lead to allocation failures when these resources are exhausted.
Logical errors
All allocations must follow the same pattern: allocation using malloc(), usage to store data, deallocation using free(). Failures to adhere to this pattern, such as memory usage after a call to free() (dangling pointer) or before a call to malloc() (wild pointer), calling free() twice (“double free”), etc., usually cause a segmentation fault and result in a crash of the program. These errors can be transient and hard to debug – for example, freed memory is usually not immediately reclaimed by the OS, and thus dangling pointers may persist for a while and appear to work.
Malloc() frequent errors…(from Wiki)
Suppose we request enough space from malloc() or calloc() to store more than one variable of some type, and assign some value to the first element in the allocated memory:
int *p;
p = malloc (10 * sizeof(int));
*p = 100; /* assigns 100 to the first sizeof (int) bytes */
Since malloc() and calloc() return a pointer to the first byte of the allocated memory, how do we access the rest of the space beyond the first element?
Pointer Arithmetic for Dynamic Arrays
We can use pointer arithmetic to access the elements beyond the first element (or even the first element).
If p points to the first integer in our example, p + 1 points to the second integer, p + 2 points to the third, and p + n points to the (n + 1)th.
When the code is compiled, the compiler can generate instructions to access the appropriate bytes, because we assigned the pointer to the allocated storage to int *p, so the compiler knows the elements are integers, and it also knows sizeof (int) for the system.
Pointer Arithmetic
How can we use pointer arithmetic and the dereference operator to access elements of our dynamically allocated storage?
int *p;
p = malloc (10 * sizeof(int));
*p = 100; /* assigns 100 to the first sizeof(int) bytes */
To assign int values to the next two elements:
*(p + 1) = 200; /*Remember, the compiler scales the value added*/
*(p + 2) = 300;
More generally, for any statically or dynamically allocated array:
array[i] accesses the same element as *(array + i)
Dereference operator with pointer arithmetic
Be careful when you use the dereference operator with a pointer to elements of a statically or dynamically allocated array, along with pointer arithmetic (suppose p points to the 1st of 5 integers):
*(p + 3) = 45; /* Assigns 45 to 4th int */
*p + 3 = 45; /* Invalid – Why? */
Pointer Arithmetic – Caution!
What appears on the left side of an assignment operator in C has to be an L-value, that is, a location in memory where a value can be stored.
What appears on the right side of an assignment operator in C has to be an R-value, that is, an expression that produces a value which can be stored.
In the invalid expression on the previous slide, we have:
*p + 3 = 45;
*p + 3 is not an L-value, however, because it is not a location in memory! It is a computed value: the int stored at p, plus 3.
Explanation of the problem
When you use pointers to access dynamically allocated storage, be sure that you do not use a pointer value that will attempt to access space outside the allocated storage!
This is undefined behavior: it typically results in a segmentation fault, but it may instead silently corrupt other data.
As we stated before, C has no library function which returns the size of an array, so you have to keep track of it explicitly, and pass it as a parameter to any function which accesses elements of an array (statically or dynamically allocated).
Checking Bounds
When functions are called in C, the parameters to the function are passed by value:
int a = 5;
int b = 10;
func1(a, b);
What this means is that the values of a and b in the calling environment will be passed to func1, but func1 will not have access to the memory where a and b are stored in the calling function, so it cannot alter their values.
The values of the parameters are copied from the variables in the calling function and placed on the stack (or in registers, depending on the system) before the function begins execution.
Function Parameters and Pointers
Normally, it is desirable that the called function not be able to change the values of variables in the calling environment, because this limits the interaction between the calling function and the called function, and makes debugging and maintenance easier.
At times, though, we may want to give a function access to the memory where the parameters are located.
In some cases in C, this is the only way we can pass a parameter; for example, elements of an array cannot be passed by value (unless each of the elements is passed as an individual parameter).
In such cases, we can, in effect, pass by reference, which means we pass a pointer to the parameter.
This will allow the function to alter the variable which is used to pass the parameter’s value in the calling environment.
Function Parameters and Pointers cont.
Consider a simple function to swap, or exchange two values, with the following declaration:
void swap(int x, int y);
Example
Consider using this function to swap, or exchange two values, in the following code:
/* Recall the declaration of the function: void swap(int x, int y); */
int a = 10;
int b = 5;
swap(a, b);
Even if swap() correctly exchanges its own copies of the two values, a and b in the calling function will still have their original values.
What to do?
Example – When pass by value does not work
int a = 10;
int b = 5;
swap(&a, &b); /*Now, swap will be able to exchange the values of a and b in the calling environment */
NOTICE: The declaration of swap must be changed too:
void swap(int *x, int *y);
because we are now passing 2 8-byte addresses rather than 2 4-byte integers
Pass by reference
Suppose we want to call a function declared as:
int sum(const int *array, int size);
It sums the elements of an array given the address of the start of the array and its size:
int array[6] = {18, 16, 15, 20, 19, 17};
int size = 6;
int total;
. . . .
total = sum(array, size); /* OR total = sum(&array[0], size); */
. . . .
Any time a pointer is passed as a parameter, if the function will not write to variables pointed to by the pointer, the const keyword should be used. This is why the first parameter of sum should be declared with the const keyword above. It will allow the function sum() to read values from the array, but not affect the values in any way.
Passing arrays as parameters
As mentioned earlier, pointers to pointers (aka double pointers) are often used in C programs.
Let’s look at an example…
char *titles[7] = {"A Tale of Two Cities",
"Wuthering Heights",
"Don Quixote",
"Odyssey",
"Moby Dick",
"Hamlet",
"Gulliver's Travels"};
Multiple Levels of Indirection
Address   Contents (one character per byte, terminated by \0)
0x200     A Tale of Two Cities\0
0x300     Wuthering Heights\0
0x400     Don Quixote\0
0x500     Odyssey\0
0x600     Moby Dick\0
0x700     Hamlet\0
0x800     Gulliver's Travels\0
Multiple Levels of Indirection
Suppose “A Tale of Two Cities” is stored at 0x200, “Wuthering Heights” at 0x300, etc.
We plan to have two other arrays. One will hold a list of the reader’s opinion of the best books and the other will hold a list of books written by English writers.
char *bestBooks[3]={"A Tale of Two Cities",
"Odyssey",
"Hamlet"};
char *englishBooks[4]={"A Tale of Two Cities",
"Wuthering Heights", "Hamlet",
"Gulliver's Travels"};
Doesn’t this seem like a waste of space???
Multiple Levels of Indirection
Instead of holding copies of the titles in all three arrays, we can build our data such that there is only one copy of the titles by using double pointers.
char **bestBooks[3];
char **englishBooks[4];
bestBooks[0]=&titles[0];
bestBooks[1]=&titles[3];
bestBooks[2]=&titles[5];
englishBooks[0]=&titles[0];
englishBooks[1]=&titles[1];
englishBooks[2]=&titles[5];
englishBooks[3]=&titles[6];
Multiple Levels of Indirection
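Element      Address of element   Value stored (address of string)   String
titles[0]    0x100                0x200                              "A Tale of Two Cities"
titles[1]    0x108                0x300                              "Wuthering Heights"
titles[2]    0x110                0x400                              "Don Quixote"
titles[3]    0x118                0x500                              "Odyssey"
titles[4]    0x120                0x600                              "Moby Dick"
titles[5]    0x128                0x700                              "Hamlet"
titles[6]    0x130                0x800                              "Gulliver's Travels"

Element           Value stored (address of a titles element)
bestBooks[0]      0x100 (&titles[0])
bestBooks[1]      0x118 (&titles[3])
bestBooks[2]      0x128 (&titles[5])
englishBooks[0]   0x100 (&titles[0])
englishBooks[1]   0x108 (&titles[1])
englishBooks[2]   0x128 (&titles[5])
englishBooks[3]   0x130 (&titles[6])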
Using multiple levels of indirection provides additional flexibility with respect to how code can be written and used. Note that if the address of a title changes, it will only require modification to the titles array. No other array would have to change.
Multiple Levels of Indirection
1. Write code to declare any necessary pointer or pointers, and to call calloc() to allocate space for 10 integers.
2. Write code to declare any necessary pointer or pointers, and to call malloc() to allocate space for 5 pointers to floats.
Exercise
int *int_ptr;
int_ptr = calloc (10, sizeof(int));
float **float_ptr_ptr;
float_ptr_ptr = malloc(5 * sizeof(float *));
Exercise Answers