C Pointers – Part 2
J.E.Jones
OSU CSE 2421
Arrays and pointers
◦ Statically allocated arrays
◦ Dynamically allocated arrays
Pointers to void (void *)
Dynamic memory allocation and pointers
Freeing (deallocating) dynamically allocated storage
Pointer arithmetic
Function parameters and pointers
Arrays in C are different from arrays in Java in several respects.
Arrays are closely related to pointers in C; elements of arrays can be
accessed using a pointer in C as well as by index.
But when arrays get complex, they are almost always accessed via pointers.
C has two different types of arrays:
◦ Statically allocated arrays: The compiler generates code to allocate the space for the array elements (and to initialize them, if they are initialized as part of the declaration); the size is fixed at compile time. The space is typically allocated in the program's static data area or on the stack, depending upon the declaration statement (file or block scope, static or automatic storage class).
◦ Dynamically allocated arrays: A C library function is called to request space for the array elements at runtime (that is, after the program begins running).
We need to declare a pointer to the element type we want the space to hold; this pointer stores the address of the block of memory.
Example:
int scores[6];
For C arrays declared as above, the compiler generates code to allocate space for the array at either load time or run time, and the array has a fixed size. The size of the array can only be changed by modifying the constant within the brackets and then recompiling the code.
The expression enclosed in [ ] must be a constant, the value of which is known at compilation time.
The [ ] cannot be empty in declarations as shown above.
Whether the array elements are of static storage class* (memory typically allocated in the program's static data area) or automatic storage class* (memory typically allocated on the stack), the compiler will generate code to allocate memory storage for the elements of the array and, additionally, to initialize them to zero if they are of static storage class. Consequently, these arrays are referred to as static arrays, or statically allocated arrays, because their size cannot be changed at run time.
*We will discuss storage classes later this week.
Example:
int scores[6] = {19, 17, 18, 16, 15, 20};
int scores[] = {19, 17, 18, 16, 15, 20};
C arrays declared as above are also allocated storage at compile time and have a fixed size.
The expression enclosed in [ ] must be a constant, the value of which is known at compilation time.
Because we are specifying initial values for array elements, the [ ] can be empty. If so, the compiler will create an array with the number of elements equal to the number of values enclosed in braces.
Thus, the two declarations above declare the same array.
Arrays of static storage class are initialized to 0 (all elements) if no explicit initialization is given for any element; the C standard guarantees this. Arrays of automatic storage class are not initialized in any way unless specifically done so with code.
If you provide fewer initial values than the number of elements in an array, the remaining values will be initialized to 0 (this works for both static and automatic storage class arrays):
int scores[10] = {19, 20}; /* last 8 elements set to 0 */
To explicitly initialize all elements to 0 (for a static storage class or automatic storage class array):
int scores[10] = {0};
Consider:
int scores[10] = {19, 17, 18, 16, 15, 20};
int scores[] = {19, 17, 18, 16, 15, 20};
These two declarations do not declare the same array. Why?
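One way to see the difference is to count the elements with sizeof. The following is only a minimal sketch; the helper names count_explicit, count_inferred, and last_of_explicit are hypothetical and exist just for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: element count when the size is written explicitly. */
size_t count_explicit(void)
{
    int scores[10] = {19, 17, 18, 16, 15, 20}; /* 10 elements; last 4 set to 0 */
    return sizeof(scores) / sizeof(scores[0]);
}

/* Hypothetical helper: element count when the compiler infers the size. */
size_t count_inferred(void)
{
    int scores[] = {19, 17, 18, 16, 15, 20};   /* exactly 6 elements */
    return sizeof(scores) / sizeof(scores[0]);
}

/* Hypothetical helper: last element of the explicitly sized array. */
int last_of_explicit(void)
{
    int scores[10] = {19, 17, 18, 16, 15, 20};
    return scores[9];                          /* not listed, so it is 0 */
}
```

Calling these from main() and printing the results shows that the first declaration reserves room for 10 ints while the second reserves room for only 6.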
There is no library function in C that will tell you the size of an array (no matter whether it is a statically or dynamically allocated array).
There is no array termination marker stored in the array (except for strings, which use char arrays – more later).
You must keep track of the size, and check indexes “manually” (in the code you write) to ensure that they are within range.
If you try to access elements beyond the last element, this will produce a run-time error (typically a segmentation fault) OR will read or write a value that is not an element of the array (a harder bug to find).
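A minimal sketch of checking indexes "manually" might look like the following. The function get_checked and its parameters are hypothetical names chosen for illustration; the point is only that the caller has to supply the element count itself.

```c
#include <stddef.h>

/*
 * Hypothetical helper: stores the element at index into *out and returns 1,
 * or returns 0 without touching *out if index is out of range.
 * The caller must pass the element count in len, because the function
 * receives only a pointer with no size attached.
 */
int get_checked(const int *array, size_t len, size_t index, int *out)
{
    if (index >= len) {
        return 0;           /* out of range: refuse the access */
    }
    *out = array[index];
    return 1;
}
```

With int scores[6], a call such as get_checked(scores, 6, 6, &value) returns 0 instead of reading past the last element.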
Arrays cannot be copied with the assignment operator in C. (The library function memcpy(), declared in string.h, copies a block of bytes, and there are dedicated functions for strings, which we’ll see soon.)
If you want to copy an array in C yourself, you copy the elements one by one (with a for or while loop). For example, we can use something such as:
int scores[6] = {19, 17, 18, 16, 15, 20}; /* Array to be copied */
int copy[6]; /* Copy of original array */
int i;
for (i = 0; i < 6; i++) {
    copy[i] = scores[i];
}
or:
int scores[6] = {19, 17, 18, 16, 15, 20}; /* Array to be copied */
int copy[6]; /* Copy of original array */
int *scr_ptr, *copy_ptr;
int i;
scr_ptr = scores;
copy_ptr = copy;
for (i = 0; i < 6; i++) {
    *copy_ptr++ = *scr_ptr++; /* note ++ increments to next array element, not a single address value. */
}
int scores[6] = {19, 17, 18, 16, 15, 20}; /* Array to be copied */
int copy[6]; /* Copy of original array */
int *scr_ptr, *copy_ptr;
int i;
scr_ptr = scores; /* note these two lines */
copy_ptr = copy;
for (i = 0; i < 6; i++) {
    *copy_ptr++ = *scr_ptr++;
}
When an array name is used by itself, it represents the address of the beginning of the array, which is also the address of the first element of the array.
i.e. scores is equivalent to &scores[0]; both reference an address.
int scores[6] = {19, 17, 18, 16, 15, 20};
In C, the name of the array by itself acts as a constant pointer to the first element of the array; that is, scores is the same as &scores[0]. (&scores is something different: it has the same address value, but its type is pointer to the whole array of 6 ints.)
Because this pointer is a constant, it cannot be changed (for example, you cannot assign a different address to it). It will always point to the spot in memory where the array declared was allocated space.
◦ You can say:
int scores[6] = {19, 17, 18, 16, 15, 20};
int *scores_ptr;
scores_ptr = scores;
◦ You can’t say:
int scores[6] = {19, 17, 18, 16, 15, 20};
int scores2[6] = {21, 16, 12, 13, 5, 7};
int *scores_ptr;
scores_ptr = scores2; /* OK: scores_ptr is a variable */
scores = scores_ptr; /* Invalid: scores cannot be assigned a new address */
All elements of the array are stored in contiguous memory locations.
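The relationship between an array name, &scores[0], and &scores can be checked with a short sketch like the one below. The function names are hypothetical; the byte distances are computed by casting to char *, which is a standard way to measure distances in bytes.

```c
#include <stddef.h>

/* Returns 1 if the array name and &scores[0] refer to the same address. */
int name_is_first_element(void)
{
    int scores[6] = {19, 17, 18, 16, 15, 20};
    int *scores_ptr = scores;          /* array name converts to int * */
    return scores_ptr == &scores[0];
}

/* Distance in bytes between two neighbouring elements. */
size_t element_stride(void)
{
    int scores[6] = {19, 17, 18, 16, 15, 20};
    return (size_t)((char *)&scores[1] - (char *)&scores[0]);
}

/* Distance in bytes covered by adding 1 to &scores (pointer to the whole array). */
size_t whole_array_stride(void)
{
    int scores[6] = {19, 17, 18, 16, 15, 20};
    return (size_t)((char *)(&scores + 1) - (char *)&scores);
}
```

Neighbouring elements are sizeof(int) bytes apart because the storage is contiguous, while &scores + 1 steps over all six elements at once.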
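[Figure: memory diagram of the scores array, holding the values 19, 17, 18, 16, 15, 20.]
scores_ptr: a variable declared as a pointer. Its value can be changed.
scores: acts as a constant pointer. Its value can’t be changed.
scores_ptr, scores, and &scores[0] all give the address of the start of the scores array (these arrows point to addresses in memory).
scores[0] or *scores gives the first value, 19; scores[1] or *(scores+1) gives the second value, 17 (these arrows point to the values in the array).
&scores gives the same address as scores, but its type is pointer to an array of 6 ints rather than pointer to int.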
As we work with pointers, it now becomes important to remember the common data type sizes.
Data Type   Size in Bytes
byte        1
char        1
short       2
int         4
long        8
float       4
double      8
These aren’t always the case, but we can use these values as typical ones.
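Since the sizes vary between systems, a quick way to check them on a given machine is to print sizeof for each type. This is a minimal sketch; the function name print_type_sizes is hypothetical, and no output is shown because the numbers depend on the platform.

```c
#include <stdio.h>

/* Prints the size in bytes of each common type on the current system. */
void print_type_sizes(void)
{
    /* sizeof yields a size_t; casting to unsigned long keeps %lu portable. */
    printf("char    %lu\n", (unsigned long)sizeof(char));
    printf("short   %lu\n", (unsigned long)sizeof(short));
    printf("int     %lu\n", (unsigned long)sizeof(int));
    printf("long    %lu\n", (unsigned long)sizeof(long));
    printf("float   %lu\n", (unsigned long)sizeof(float));
    printf("double  %lu\n", (unsigned long)sizeof(double));
    printf("int *   %lu\n", (unsigned long)sizeof(int *));
}
```

Only sizeof(char) is fixed at 1 by the language; the other values are whatever the compiler and target system use.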
All elements of an array are stored in contiguous memory locations.
int scores[6] = {19, 17, 18, 16, 15, 20};
Array element   Address*    Value
scores[0]       0x600020    19
scores[1]       0x600024    17
scores[2]       0x600028    18
scores[3]       0x60002C    16
scores[4]       0x600030    15
scores[5]       0x600034    20
*Since addresses are 8 bytes, the full address is 0x0000000000600020, etc.
When we have pointers, there are several arithmetic operations that we can perform*:
◦ Using ++ and -- operators
◦ Adding an integer to a pointer
◦ Subtracting an integer from a pointer
◦ Subtracting two pointers from each other
◦ Comparing two pointers
*these operations are not permitted on pointers to functions
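The operations listed above can be sketched in a few lines. The function names sum_with_pointer and elements_between are hypothetical and used only to give each operation a place to appear.

```c
#include <stddef.h>

/* Sums len ints by walking a pointer from the first element to one past the last. */
long sum_with_pointer(const int *first, size_t len)
{
    const int *p = first;
    const int *end = first + len;   /* adding an integer to a pointer */
    long total = 0;

    while (p < end) {               /* comparing two pointers */
        total += *p;
        p++;                        /* ++ moves to the next int, not the next byte */
    }
    return total;
}

/* Number of elements between two pointers into the same array. */
ptrdiff_t elements_between(const int *from, const int *to)
{
    return to - from;               /* subtracting two pointers */
}
```

In every case the compiler scales by the element size, so the results are counted in elements rather than bytes.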
When we have pointers and are using them within an array of elements of a particular data type, the ++ and -- operators allow us to move from the address of one element in the array to the next (++) or the previous (--).
For example:
◦ #define SIZE 15
◦ long l_array[SIZE];
◦ long *l_array_ptr = l_array;
◦ int i;
◦ For (i=0; i
void *calloc(size_t nmemb, size_t size);
void *malloc(size_t size);
void free(void *ptr);
void *realloc(void *ptr, size_t size);
DESCRIPTION
calloc() allocates memory for an array of nmemb elements of size bytes each and returns a pointer to the allocated memory. The memory is set to zero. If nmemb or size is 0, then calloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
malloc() allocates size bytes and returns a pointer to the allocated memory. The memory is not cleared. If size is 0, then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
free() frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc(). Otherwise, or if free(ptr) has already been called before, undefined behavior occurs. If ptr is NULL, no operation is performed.
malloc() returns a pointer to void (i.e., void *), which holds the address of the first byte of the allocated memory space on the heap.
This function takes one parameter – which may be an expression – that specifies the number of bytes being requested. (Use sizeof() for portability!)
For example, to request enough bytes for 4 integer values, we could use:
int *p;
p = (int *)malloc ( 4 * sizeof(int) ); /* malloc (4 * 4) is not portable! */
If we only needed the number of bytes for one int, we could use:
int *p;
p = (int *)malloc ( sizeof(int) );
The memory returned by malloc is uninitialized* (contains garbage values), so be sure to initialize it before you use it!
*Does anyone see a security problem here??? OSU did. malloc() is mapped to calloc() on stdlinux.
calloc() returns a pointer to void (i.e., void *), which holds the address of the first byte of the allocated memory space on the heap.
This function takes two parameters – which may be expressions – that specify the number of elements for which storage is being requested and the size of each element in bytes. (Use sizeof() for portability!)
So, to request memory space for 4 integer values, we could use:
int *p;
p = calloc ( 4, sizeof(int) ); /* calloc (4, 4) is not portable! */
If we only needed space for one int, we could use:
int *p;
p = calloc ( 1, sizeof(int) );
The memory returned by calloc is initialized to 0, so if you do not plan to initialize the values before using them, use calloc(), and not malloc()!
◦ This means that calloc() will take more CPU time to execute than will malloc(). It may or may not be an issue, but worth thinking about depending upon how much space you are requesting.
◦ Just because OSU maps malloc() to calloc() doesn’t mean that all other systems you may work on do the same thing. So beware!!
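The difference between the two calls can be sketched as follows. The helper names make_zeroed and make_filled are hypothetical; both return storage that the caller is expected to release with free().

```c
#include <stdlib.h>

/* Allocates n ints with calloc(); every element starts at 0. Caller frees. */
int *make_zeroed(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Allocates n ints with malloc() and fills them explicitly. Caller frees. */
int *make_filled(size_t n, int value)
{
    size_t i;
    int *p = malloc(n * sizeof(int));

    if (p == NULL) {
        return NULL;                /* allocation failed */
    }
    for (i = 0; i < n; i++) {
        p[i] = value;               /* malloc() memory holds garbage until set */
    }
    return p;
}
```

If every element is going to be overwritten anyway, the malloc() version avoids the extra zeroing pass; otherwise calloc() is the safer default.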
If the requested memory cannot be allocated, both malloc() and calloc() return the NULL pointer (defined in stdlib.h), which has a value of 0.
Therefore, before using the pointer to access any of the allocated memory, you should check to make sure that the pointer returned was not NULL. For example:
int *p;
p = (int *)calloc (10, sizeof(int) );
if (p != 0) { /* Or if (p != NULL); NULL is #defined in stdlib.h */
    . . . . /* OK to access values in allocated storage */
} else {
    . . . . /* Some code to handle the allocation failure */
}
. . . .
If your program uses storage which has been allocated dynamically, then you should free it (return it to the operating system) once it is no longer being used.
The C library function free() is used for this; it returns void, and has a single parameter, which is a pointer to the first byte of the allocated storage to be freed, and this pointer MUST be pointing to the 1st byte of some dynamically allocated storage!
free() is also declared in stdlib.h.
Here’s an example of how to free dynamically allocated memory:
int *p;
p = calloc (10, sizeof (int) );
……
if (p != NULL) free (p); /* releases storage to which p points back to the OS */
p = NULL;
The pointer which was passed to free should also be set to NULL or 0 after the call to free(), to ensure that you do not attempt to access it inadvertently. To do so can cause a segmentation fault.
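Putting allocation, checking, use, and release together gives a pattern like the sketch below. The helper free_and_clear is a hypothetical convenience, not a library function; it takes the address of the pointer so that it can set the caller's pointer to NULL.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: frees the block *pp points to and sets *pp to NULL,
 * so the caller cannot accidentally reuse the stale address.
 */
void free_and_clear(int **pp)
{
    if (pp != NULL) {
        free(*pp);      /* free(NULL) is a harmless no-op */
        *pp = NULL;
    }
}

/* Allocates 10 ints, uses them, and releases them. Returns 0 on success, -1 on failure. */
int lifecycle_demo(void)
{
    int *p = calloc(10, sizeof(int));

    if (p == NULL) {
        return -1;      /* handle the allocation failure */
    }
    p[0] = 100;         /* OK to access the allocated storage */
    free_and_clear(&p); /* p is NULL from here on */
    return (p == NULL) ? 0 : -1;
}
```

Because the pointer is NULL after the call, calling free_and_clear() a second time on the same pointer does nothing, which avoids a double free.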
Run your program with a utility called valgrind.
If we ran valgrind on lab2:
[jones.5684@cse-fl1 lab2]$ valgrind bit_decode1 < decode.input1
==36528== Memcheck, a memory error detector
==36528== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==36528== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==36528== Command: bit_decode1
==36528==
this is a message
==36528==
==36528== HEAP SUMMARY:
==36528== in use at exit: 0 bytes in 0 blocks
==36528== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==36528==
==36528== All heap blocks were freed -- no leaks are possible
==36528==
==36528== For lists of detected and suppressed errors, rerun with: -s
==36528== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[jones.5684@cse-fl1 lab2]$
Look for this sentence: "All heap blocks were freed -- no leaks are possible".
The improper use of dynamic memory allocation can frequently be a source of bugs. These can include security bugs or program crashes, most often due to segmentation faults. The most common errors are as follows:
Not checking for allocation failures
Memory allocation is not guaranteed to succeed and may instead return a null pointer. Using the returned value, without checking if the allocation is successful, invokes undefined behavior. This usually leads to a crash (due to the resulting segmentation fault on the null pointer dereference), but there is no guarantee that a crash will happen so relying on that can also lead to problems.
Memory leaks
Failure to deallocate memory using free() leads to buildup of non-reusable memory, which is no longer used by the program. This wastes memory resources and can lead to allocation failures when these resources are exhausted.
Logical errors
All allocations must follow the same pattern: allocation using malloc(), usage to store data, deallocation using free(). Failures to adhere to this pattern, such as memory usage after a call to free() (dangling pointer) or before a call to malloc() (wild pointer), calling free() twice ("double free"), etc., usually cause a segmentation fault and result in a crash of the program. These errors can be transient and hard to debug – for example, freed memory is usually not immediately reclaimed by the OS, and thus dangling pointers may persist for a while and appear to work.
Suppose we request enough space from malloc() or calloc() to store more than one variable of some type, and assign some value to the first element in the allocated memory:
int *p;
p = (int *)malloc (10 * sizeof(int));
*p = 100; /* assigns 100 to the first (sizeof (int)) bytes*/
Since malloc() and calloc() return a pointer to the first byte of the allocated memory, how do we access the rest of the space beyond the first element?
We can use pointer arithmetic to access the elements beyond the first element.
If p points to the first integer in our example, p + 1 points to the second integer, p + 2 points to the third, and p + n points to the (n + 1)th.
When the code is compiled, the compiler generates instructions to access the appropriate bytes, because we assigned the pointer to the allocated storage to int *p, so the compiler knows the elements are integers, and it also knows sizeof (int) for the system.
How can we use pointer arithmetic and the dereference operator to access elements of our dynamically allocated storage?
int *p;
p = (int *)malloc (10 * sizeof(int));
*p = 100; /* assigns 100 to the first sizeof(int) bytes */
To assign int values to the next two elements:
*(p + 1) = 200; /* compiler scales the value added */
*(p + 2) = 300;
More generally, for any statically or dynamically allocated array:
array[i] accesses the same element as *(array + i)
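A small sketch can confirm the equivalence on dynamically allocated storage. The function name index_matches_pointer is hypothetical; it writes through pointer arithmetic and reads back through index notation.

```c
#include <stdlib.h>

/*
 * Fills a dynamically allocated block using pointer arithmetic, then checks
 * that index notation reaches the same elements.
 * Returns 1 if every p[i] matches *(p + i), 0 if not, -1 if allocation fails.
 */
int index_matches_pointer(void)
{
    int i;
    int same = 1;
    int *p = malloc(10 * sizeof(int));

    if (p == NULL) {
        return -1;
    }
    for (i = 0; i < 10; i++) {
        *(p + i) = (i + 1) * 100;   /* 100, 200, 300, ... */
    }
    for (i = 0; i < 10; i++) {
        if (p[i] != *(p + i) || &p[i] != p + i) {
            same = 0;               /* never happens: both forms name one element */
        }
    }
    free(p);
    return same;
}
```

The same check works unchanged for a statically allocated array, since the array name converts to a pointer to its first element.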
Be careful when you use the dereference operator with a pointer to elements of a statically or dynamically allocated array, along with pointer arithmetic (suppose p points to the 1st of 5 integers):
*(p + 3) = 45; /* Assigns 45 to 4th int */
*p + 3 = 45; /* Invalid - Why? */
What appears on the left side of an assignment operator in C has to be an L-value, that is, a location in memory where a value can be stored.
What appears on the right side of an assignment operator in C has to be an R-value, that is, a value which can be stored in binary form.
In the invalid expression on the previous slide, we have:
*p + 3 = 45;
*p + 3 is not a valid expression for an L-value: because * binds more tightly than +, it means (*p) + 3, which is a computed value rather than a location in memory.
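The two expressions can be compared side by side in a short sketch. The function names store_fourth and first_plus_three are hypothetical; each assumes p points to the first of at least 4 ints.

```c
/* Writes 45 into the 4th int and returns that element. */
int store_fourth(int *p)
{
    *(p + 3) = 45;      /* *(p + 3) names a memory location, so it is an L-value */
    return p[3];
}

/* Reads the first int and adds 3; the array is left unchanged. */
int first_plus_three(const int *p)
{
    return *p + 3;      /* (*p) + 3 is only a value; it cannot be assigned to */
}
```

So *p + 3 is perfectly legal on the right side of an assignment; it just cannot appear on the left.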
When you use pointers to access dynamically allocated storage, be sure that you do not use a pointer value that will attempt to access space outside the allocated storage!
This is undefined behavior: it may result in a segmentation fault, or it may silently read or write memory that does not belong to the allocation.
As we stated before, C has no library function which returns the size of an array, so you must keep track of it explicitly and pass it as a parameter to any function which accesses elements of an array (statically or dynamically allocated).
When functions are called in C, the parameters to the function are passed by value:
int a = 5;
int b = 10;
func1(a, b);
What this means is that a copy of the values of a and b in the calling environment will be passed to func1, but func1 will not have access to the memory where a and b are stored in the calling function, so it cannot alter their values.
The values of the parameters are placed on the stack (the values are copied from the variables in the calling function, and written to the stack), before the function begins execution.
Normally, it is desirable that the called function not be able to change the values of variables in the calling environment, because this limits the interaction between the calling function and the called function and makes debugging and maintenance easier.
At times, though, we may want to give a function access to the memory where the parameters are located.
Sometimes in C, this is the only way we can pass a parameter; for example, elements of an array cannot be passed by value (unless each of the elements is passed as an individual parameter).
In such cases, we can, in effect, pass by reference, which means we pass by value a pointer to the parameter.
This will allow the called function to reference all elements of the array and alter their values.
Consider a simple function to swap, or exchange two values, with the following declaration:
void swap(int x, int y);
Consider using this function to swap, or exchange two values, in the following code:
/* Recall the declaration of the function: void swap(int x, int y); */
int a = 10;
int b = 5;
swap(a, b);
Even if swap() correctly exchanges x and y inside the called function, a and b in the calling function will still have their original values, because swap() received only copies.
What to do?
int a = 10;
int b = 5;
swap(&a, &b); /* Now, swap will be able to exchange the values of a and b in the calling environment */
NOTE: The declaration of swap must be changed too:
void swap(int *x, int *y);
because we are now passing two 8-byte addresses rather than two 4-byte integers
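The slides give only the declarations, so the bodies below are a minimal sketch of how the two versions might be written. The name swap_by_value is hypothetical and is used here only to keep the two versions apart.

```c
/* Receives copies of the caller's values; the exchange is lost on return. */
void swap_by_value(int x, int y)
{
    int temp = x;
    x = y;
    y = temp;
}

/* Receives the addresses of the caller's variables and exchanges their contents. */
void swap(int *x, int *y)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}
```

After swap_by_value(a, b) the caller's a and b are unchanged; after swap(&a, &b) they have exchanged values.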
Suppose we want to call a function declared as:
int sum(const int *array, int size);
It sums the elements of an array given the address of the start of the array and its size:
int array[6] = {18, 16, 15, 20, 19, 17};
int size = 6;
int total;
....
total = sum(array, size); /* OR total = sum(&array[0], size); */
....
Any time a pointer is passed as a parameter, if we do not want to give the called function the ability to write to variables pointed to by the pointer, the const keyword can be used. Here the first parameter of sum() is declared with the const keyword. It allows sum() to read values from the array, but not to change them in any way.
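Only the declaration of sum() appears above, so the body here is one possible sketch of it, not necessarily the version used in the course.

```c
/* Sums size ints starting at array; const prevents writes through the pointer. */
int sum(const int *array, int size)
{
    int i;
    int total = 0;

    for (i = 0; i < size; i++) {
        total += array[i];
        /* array[i] = 0;  would not compile: array points to const int */
    }
    return total;
}
```

For the array {18, 16, 15, 20, 19, 17}, sum(array, 6) and sum(&array[0], 6) both return 105.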
As mentioned earlier, pointers to pointers (aka double pointers) are often used in C programs.
Let’s look at an example...
If we wanted to specify a single book title, we could say:
char *title = "A Tale of Two Cities";
Offset  00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14
0x200   A  sp T  a  l  e  sp o  f  sp T  w  o  sp C  i  t  i  e  s  \0
(sp = space character)
Given this example, the variable title would have the value 0x200. This is the address where the string “A Tale of Two Cities” is stored.
If we wanted to specify several book titles, we could say:
char *titles[7] = {"A Tale of Two Cities", "Wuthering Heights",
                   "Don Quixote", "Odyssey",
                   "Moby Dick", "Hamlet", "Gulliver's Travels"};
[Figure: memory layout of the seven strings, one character per byte, each ending in \0.]
Address   String stored starting at that address
0x200     A Tale of Two Cities\0
0x300     Wuthering Heights\0
0x400     Don Quixote\0
0x500     Odyssey\0
0x600     Moby Dick\0
0x700     Hamlet\0
0x800     Gulliver's Travels\0
Suppose “A Tale of Two Cities” is stored at 0x200, “Wuthering Heights” at 0x300, etc. (Low addresses are used for ease of reading.) In practice, these strings would be stored in contiguous memory; I’m using addresses 0x100 (100 in base 16) apart for easy address computations.
Pointer notation   Array notation   Address of element    Value (address of string)
*titles            titles[0]        0x0000000000600400    0x0000000000000200
*(titles+1)        titles[1]        0x0000000000600408    0x0000000000000300
*(titles+2)        titles[2]        0x0000000000600410    0x0000000000000400
*(titles+3)        titles[3]        0x0000000000600418    0x0000000000000500
*(titles+4)        titles[4]        0x0000000000600420    0x0000000000000600
*(titles+5)        titles[5]        0x0000000000600428    0x0000000000000700
*(titles+6)        titles[6]        0x0000000000600430    0x0000000000000800
titles = 0x600400
In this example, the char *titles[7] array has 56 bytes (7 * 8 bytes) allocated to it because it holds 7 8-byte addresses. The 56 bytes of space start at address 0x0000000000600400.
We plan to have two other arrays. One will hold a list of the reader’s opinion of the best books and the other will hold a list of books written by English writers.
char *bestBooks[3] = {"A Tale of Two Cities", "Odyssey", "Hamlet"};
char *englishBooks[4] = {"A Tale of Two Cities", "Wuthering Heights",
                         "Hamlet", "Gulliver's Travels"};
Doesn’t this seem like a waste of space??? We’ve got 2 or 3 different spots where we’ve duplicated these strings.
Instead of holding copies of the titles in all three arrays, we can build our data such that there is only one copy of the titles by using double pointers.
char *titles[7] = {"A Tale of Two Cities", "Wuthering Heights",
                   "Don Quixote", "Odyssey",
                   "Moby Dick", "Hamlet", "Gulliver's Travels"};
Instead of:
char *bestBooks[3] = {"A Tale of Two Cities", "Odyssey", "Hamlet"};
char *englishBooks[4] = {"A Tale of Two Cities", "Wuthering Heights",
                         "Hamlet", "Gulliver's Travels"};
we can write:
char **bestBooks[3];
char **englishBooks[4];
bestBooks[0] = &titles[0];
bestBooks[1] = &titles[3];
bestBooks[2] = &titles[5];
englishBooks[0] = &titles[0];
englishBooks[1] = &titles[1];
englishBooks[2] = &titles[5];
englishBooks[3] = &titles[6];
This 2nd option saves space and gives us the flexibility to correct any typographical errors only once rather than in each array.
[Figure: elements of bestBooks and englishBooks point to elements of titles, which in turn point to the strings.]
Element           Value (address of a titles element)
bestBooks[0]      0x600400
bestBooks[1]      0x600418
bestBooks[2]      0x600428
englishBooks[0]   0x600400
englishBooks[1]   0x600408
englishBooks[2]   0x600428
englishBooks[3]   0x600430

Element     Address     Value    String stored at that address
titles[0]   0x600400    0x200    "A Tale of Two Cities"
titles[1]   0x600408    0x300    "Wuthering Heights"
titles[2]   0x600410    0x400    "Don Quixote"
titles[3]   0x600418    0x500    "Odyssey"
titles[4]   0x600420    0x600    "Moby Dick"
titles[5]   0x600428    0x700    "Hamlet"
titles[6]   0x600430    0x800    "Gulliver's Travels"
Using multiple levels of indirection provides additional flexibility with respect to how code can be written and used. Note that if the address of a title changes, it will only require modification to the titles array. No other array would have to change.
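The arrangement can be sketched as a small compilable unit. Two things here are assumptions made for the illustration: the arrays are declared const char * (rather than char *) because string literals should not be modified, and the accessor and setter functions (best_title, english_title, set_title) are hypothetical names.

```c
/* One copy of each title; the other arrays point at elements of titles. */
static const char *titles[7] = {
    "A Tale of Two Cities", "Wuthering Heights", "Don Quixote", "Odyssey",
    "Moby Dick", "Hamlet", "Gulliver's Travels"
};

static const char **bestBooks[3] = { &titles[0], &titles[3], &titles[5] };
static const char **englishBooks[4] = { &titles[0], &titles[1], &titles[5], &titles[6] };

/* Follows both levels of indirection: list entry -> titles element -> string. */
const char *best_title(int i)
{
    return *bestBooks[i];
}

const char *english_title(int i)
{
    return *englishBooks[i];
}

/* Hypothetical helper: repoints one titles element, e.g. to correct a title. */
void set_title(int i, const char *corrected)
{
    titles[i] = corrected;
}
```

After a call such as set_title(5, ...), both best_title(2) and english_title(2) return the new string, because each list stores the address of titles[5] rather than its own copy of the title.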