OSU CSE 2421
Required Reading: Pointers on C,
Beginning of Chapter 3 through Section 3.1.3
J. E. Jones
POINTERS
◦ At last, we arrive at THE MOST DREADED WORD in the lexicon of the C programming student*. Pointers are indeed so dreaded that Java has completely done away with pointers and wrapped their functionality into the (admittedly safer) concept of references. C++, as a transitional step, has both pointers and references.
◦ When we are done, you will have one of the most powerful tools in the C language to add to your belt!
◦ *well, at least those who don’t know how to use them yet!!
What is covered in slide decks 12 and 13 is one of the main differences between C and other languages.
It’s quite possible that you already know some of this material. If so, it will be very easy to tune out.
Please try not to do so.
This material can go from “This is so boring!!!” to “$^*@!!, I missed something important!” at lightning speed.
Interestingly enough, by paying close attention to this lecture and the next one, you may find the rest of the course to be mostly intuitive and easy to grasp.
Do you know where this is???
◦ Building 082, 411 Woody Hayes Drive, Columbus, Ohio, 43210-1140, United States of America
What about this one???
◦ 555 Borror Drive, Columbus, Ohio, 43210-1187, United States of America
Should you???
Everyone on campus knows where Ohio Stadium is…
Did you know the address of Ohio Stadium is:
◦ Building 082,
◦ 411 Woody Hayes Drive
◦ Columbus, Ohio, 43210-1140
◦ United States of America
If you are going to meet someone outside the stadium, do you tell them you’ll meet them at “The Shoe” or at 411 Woody Hayes Drive?
This is the address for The Schott:
◦ 555 Borror Drive
◦ Columbus, Ohio, 43210-1187
◦ United States of America
What about the CSE Department Office?
Did you know the address for it is:
◦ Room 395
◦ Dreese Laboratories,
◦ 2015 Neil Avenue,
◦ Columbus, Ohio, 43210-1277
◦ United States of America
If you need to go to the CSE Department office, do you tell someone you’re going to the “CSE office” or to Room 395 at 2015 Neil Avenue??
In all instances, it’s easier to talk to someone who is “local” about Ohio Stadium or The Shoe or The Schott or CSE Office, right?
But, if someone is not “local”, is not familiar with local buildings, or is trying to send something to one of these locations via FedEx or USPS, or doesn’t care what’s there, just needs to go there, we’d give them the real address, right?
Sometimes people use longitude/latitude values, right?
Consider that C variable names are like building names:
◦ Dreese_Labs
◦ Ohio_Stadium
◦ McPherson_Hall
Each of these places has an address!
◦ Dreese_Labs: 2015 Neil Avenue
◦ Ohio_Stadium: 411 Woody Hayes Drive
◦ McPherson_Hall: 140 W 18th Avenue
Each variable you declare also has an address.
We have places (variable names) where we store building names, but we also have places (other variable names) where we can store their associated addresses. Hmmmm…..
The basic concept of a pointer is really rather simple: it is a variable that stores the address of a memory location.
◦ Sometimes, that memory location can also be referenced by way of a variable name
◦ For example:
int b;
int *int_ptr;
If we say int_ptr = &b;, then we can get to the same memory location either by using the variable b or by referencing the address that is stored in int_ptr.
b = 17; is equivalent to *int_ptr = 17;
[Diagram: int_ptr holds the address of b; the memory location named b (or *int_ptr) holds the value 17.]
• Values of variables are stored in memory, at a particular location
• Exactly where is typically unknown to a high-level language programmer and s/he doesn’t care what the address is, anyway, at least most of the time.
• One of the data types that can be stored (in addition to all floating point and integer data types) is a pointer!
• A location in memory is identified and referenced with an 8-byte address (when on a 64-bit machine like stdlinux).
• This address is analogous to identifying a house’s location via an address
• The “size” (e.g. char, int, etc.) can be thought of as analogous to the size of the lot the house is on.
• An apartment in New York City
• A 5000-acre farm in Iowa
• This address can also be called a reference (but usually is not done so in C terminology).
Variable Name    A         B         C                     D
Address          0x1000    0x1004    0x1008    0x100C      0x1010
Value            112       08        4100      00          883
• The total amount of memory used above is 20 bytes
• If these were values in our program, remembering where they are in memory, and using addresses to access them, would be cumbersome and error-prone
• That’s why we typically use identifiers for variables to associate a name with a memory location where the variable is stored; the compiler and assembler take care of mapping identifiers to addresses, which frees programmers from this burden.
• This mapping process is accomplished with what is called a Symbol Table or by relative address depending upon how it was declared.
We can access values within variables using identifiers in C, and we often do that.
In C, it is also possible to access the value of a variable in memory using another variable (can also be a constant) which holds the 8-byte* address of the first variable.
The second type of variable mentioned above is a pointer.
Sometimes data in C programs can only be accessed with pointers (for example, elements of arrays, or characters in strings (more on this later)).
*Addresses are 8 bytes on 64-bit processors and 4 bytes on 32-bit processors.
• Remember, at the lowest level, data is always just a bit pattern; a given bit pattern can have multiple interpretations, depending on which type of data it encodes.
• The type of data it encodes can also determine the size of the bit pattern.
• In the CPU (Central Processing Unit), instructions which perform a given type of operation on data of different types are different instructions (for example, integer addition and floating-point addition are performed by different assembler level instructions).
• When the compiler generates instructions for the CPU to access a value in memory, it does not make any assumptions about how to interpret that value (the bit pattern). You must explicitly tell the compiler the type of value stored there (with data type in a declaration or with an explicit cast), so that it can generate the appropriate type of assembler instruction for the CPU.
• When using pointers (addresses), the compiler chooses assembler instructions for the CPU to execute based upon the data type you declared the pointer to represent.
• Another way to think about this is to say, from the compiler’s perspective, it is not enough to know an address (or even a variable name) to access data.
• The compiler will always ask the question: What type of data is stored at this address (or in this variable)?
• Your code must answer this question for the compiler (with a declaration, cast, or both), or it will give you warnings or errors, and, perhaps, wrong information.
Address     Bits stored there
0x600200    0000 1010
0x600201    0011 1110
0x600202    0000 0000
0x600203    0010 1000
0x600204    0000 1110
0x600205    0000 0010
0x600206    0000 1000
0x600207    0000 1010
I am a pointer! My value is 0x600200! Look! Here’s the memory address 0x600200 and there’s data stored there!
What type pointer am I? That will tell me how many bytes to read and how to interpret the data.
If I’m a char *, then I’ll read the value 0000 1010 and interpret it as signed binary.
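If I’m an int *, then I’ll read the 4 bytes 0000 1010 0011 1110 0000 0000 0010 1000 (addresses 0x600200 through 0x600203) and interpret them as signed binary. The order in which those bytes are assembled into the value depends on the machine’s byte ordering.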
If I’m a double *, then I’ll read ALL 8 bytes (addresses 0x600200 through 0x600207) and interpret them as IEEE 754 double precision!
• Most pointers are used to manipulate data in memory.
• By using pointers to manipulate data, it is often the case that we can
• Create faster and more efficient code
• Support dynamic memory allocation
• Make expressions compact and succinct
• Provide the ability to pass data structures as parameters without incurring large overhead
• Protect data passed as a parameter to a function
int A = 112;          /* 0b 0000 0000 0000 0000 0000 0000 0111 0000 */
printf("%d\n", A);    /* Prints 112, as expected */
printf("%f\n", A);    /* Compiles and runs, but */
                      /* strange result: 0.000000 */
Binary value 0b 0000 0000 0000 0000 0000 0000 0111 0000 passed to both printf() calls.
If you do not use memory in the way it was declared, the compiler may try to protect you by issuing a warning or an error, but not always.
To use a value as a type other than its declared type, we should cast explicitly (better) or make sure the compiler’s implicit conversion does what we intend.
Variable Name      A         B         C                     D
Address (hex)      0x1000    0x1004    0x1008    0x100C      0x1010
Value (decimal)    112       08        4100      00          883
                                       (0x1004)
int A = 112;
int B = 8;
int *C = (int *) 4100;    /* equivalent to 0x1004; the cast turns the integer into an address */
int D = 883;
/* 112, 0b 0000 0000 0000 0000 0000 0000 0111 0000 passed */
printf("%u\n", (unsigned int) A);
/* 112.000000, 0b 0100 0010 1110 0000 0000 0000 0000 0000 as a float */
/* (printf receives it promoted to double) */
printf("%f\n", (float) A);
These will both work!
You must tell the compiler how to generate instructions to interpret memory when you want to use a value in a way different from the way in which it was declared.
• A pointer is a variable (usually) or a constant that contains the address of (that is, a reference to) another variable
• The type of a pointer is: pointer to the type of data to which it points.
• For example, the type of a pointer which points to an integer is a pointer to integer.
* (the unary dereference operator) is used in the declaration of a pointer type
The trick to reading pointer declarations so that they are easy to understand is to read them backward.
For example: const short *s_ptr;
◦ s_ptr is a variable
◦ s_ptr is a pointer variable
◦ s_ptr is a pointer variable to a short
◦ s_ptr is a pointer variable to a constant short
Pointer variables are declared using a data type followed by an asterisk, then the pointer variable’s name.
◦ Integer declaration: int number;
◦ Integer pointer declaration: int *i_ptr;
The use of white space around the asterisk is irrelevant. The following declarations are equivalent:
◦ int* i_ptr;
◦ int * i_ptr;
◦ int *i_ptr;
◦ int*i_ptr;
Each pointer variable must be declared with an asterisk next to its identifier.
Consider:
◦ int *i_ptr1, i_ptr2, *i_ptr3;
i_ptr1 is an integer pointer
i_ptr2 is an integer (there is no asterisk)
i_ptr3 is an integer pointer
the names of the variables don’t affect what type they are
int* ptr_a;
int *ptr_a;
Both are valid, but the first style leads to mistakes:
◦ int* ptr_b, ptr_c, ptr_d;
ptr_b is a pointer to integer, but ptr_c and ptr_d are integers
◦ int *ptr_b, *ptr_c, *ptr_d;
3 pointers to integer are declared here
Generally, the operand of the dereference operator is the identifier or expression which follows, so there must be a separate dereference operator for each pointer variable you wish to declare.
• & (the unary address operator) gives the “address of” a piece of data; this is always a constant. The constant that represents the address is determined by the compiler. Suppose we have:
int *p;
int c = 10;
p = &c;    /* This statement assigns to the variable p the */
           /* address of the variable c. So, p points to */
           /* the memory location of variable c */
Assigning an address to a variable in this way can ONLY happen at run time because where the program we are running is loaded into memory can change each time it’s run.
There are a few different ways to print out the value of pointers.
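Since a pointer is just an 8-byte string of bits, we can print it out using %d, but %d expects a 4-byte int, so expect a compiler warning and possibly a truncated value.
Other alternatives are:
◦ %x - display the value as a hexadecimal number (same caveat as %d)
◦ %o - display the value as an octal number (same caveat as %d)
◦ %p - display the value in an implementation-specific manner; typically as a hexadecimal number. This is the conversion intended for pointers (pass the pointer cast to void *).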
Variable Name    A         B         C                     D
Address          0x1000    0x1004    0x1008    0x100C      0x1010
Value            112       08        4100      00          883
                                     (0x1004)
int B = 8;    /* B = the 4-byte value, 0x00000008 */
int *C;       /* Declare C to be a pointer to int */
C = &B;       /* and assign the variable C the address of B */
              /* C = the 8-byte value, 0x0000000000001004 */
Now, C is an integer pointer that points to B. C is an 8-byte address that points to a 4-byte value. We can access B either through its name (identifier) or using indirection through C (See next slide).
Suppose we wanted to print the value of B.
We can access the value using the identifier B
or, using indirection, with *C:
printf("%i", B);
printf("%i", *C);
Both statements output the value of B; the second one accesses the value using indirection. The value passed to printf() in both statements is the 4-byte value 0x00000008.
• Every pointer points to a specific data type
◦ An exception is a void pointer (a generic pointer); a pointer to void holds the address of a value of any type but can’t be dereferenced (i.e. cannot be used to get the “contents of” another memory location through indirection) without casting. We will learn more about the use of pointers to void soon.
• PLEASE do not say “this is a pointer in my program.” Instead, say “this is an integer pointer” or “this is a float pointer”, or “this is a void pointer”, etc.
• This is not just being picky. Making use of pointers in C programs without all kinds of strange errors requires always paying attention to the type of data to which the pointer points!
Examples:
unsigned int *p;    /* p is a pointer to unsigned int */
char *c;            /* c is a char pointer */
void *x;            /* x is a void pointer */
int **y;            /* y is a pointer to an integer pointer */
• unsigned int *x;
• Read as: “declare x as a pointer to an unsigned integer”
• Interpretation: declare x as a variable that holds the numeric (8-byte) address of a location in memory at which bits are stored that we intend to manipulate as an unsigned (4-byte) integer
• At this point, x does not contain a valid value. Why?
• It is good practice to initialize a pointer as soon as possible
Variable (identifier)    var (normal variable)    var2 (normal variable)    ptr (pointer)
Address in memory        1000                     1004                      2048
Value                    50                       0                         1000
int var = 50, var2 = 0;    /* access var directly, through its name */
int *ptr;                  /* declare ptr to be a pointer to int */
ptr = &var;                /* ptr points to var (i.e. ptr contains 1000) */
Variable (identifier)    var (normal variable)    var2 (normal variable)    ptr (pointer)
Address in memory        1000                     1004                      2048
Value                    1                        0                         1000
int var = 50, var2 = 0;    /* access var directly, through its name */
int *ptr;                  /* declare ptr to be a pointer to int */
ptr = &var;                /* ptr points to var (i.e. ptr contains 1000) */
*ptr = 1;                  /* Access var using indirection, that is, through */
                           /* the address in ptr */
                           /* so var and var2 now equal what? */
var now equals 1, var2 still equals 0
Variable (identifier)    var (normal variable)    var2 (normal variable)    ptr (pointer)
Address in memory        1000                     1004                      2048
Value                    1                        1                         1000
int var = 50, var2 = 0;    /* access var directly, through its name */
int *ptr;                  /* declare ptr to be a pointer to int */
ptr = &var;                /* ptr points to var (i.e. ptr contains 1000) */
*ptr = 1;                  /* Access var using indirection, that is, through */
                           /* the address in ptr */
var2 = *ptr;               /* Access var using indirection, that is, through */
                           /* the address in ptr */
What value does the variable var2 have? It is set to 1!
• * (unary, not the arithmetic operator) is a dereferencing operator when applied to pointers
◦ When applied to a pointer, it accesses the data (i.e., the bits in memory) the pointer points to
◦ * in front of a pointer variable means “get (or set) the value at that address” i.e. “contents of” (what the pointer points to)
◦ “get” if it is an Rvalue
◦ “set” if it is an Lvalue
Reading data through indirection:
int a;
int b = 25;
int *p;
p = &b;
a = *p; means get the value at the address stored in p and assign it to the variable a
Writing data through indirection:
*p = 12; means set the 4 bytes (because it’s an integer *) starting at the address stored in p to the value 12. The full 4-byte value stored would be 0x0000000C.
• Example: y = *int_ptr + 1 takes whatever int_ptr points at, adds 1, and assigns the result to y
• Other ways to increment by 1:
◦ *int_ptr = *int_ptr + 1
◦ *int_ptr += 1
◦ ++*int_ptr
◦ (*int_ptr)++
The parentheses are necessary in the last example; without them, the expression would increment int_ptr itself (so that it points to the following int in memory) instead of the value it points to. Postfix increment has higher precedence than the dereference operator *, so without parentheses the compiler treats the expression as *(int_ptr++).
Variable (identifier)    var (normal variable)    var2 (normal variable)    ptr (pointer)
Address in memory        1000                     1004                      2048
Value                    1                        2                         1000
int var = 50, var2 = 0;    /* access var directly, through its name */
int *ptr;                  /* declare ptr to be a pointer to int */
ptr = &var;                /* ptr points to var (i.e. ptr contains 1000) */
*ptr = 1;                  /* Access var using indirection, that is, through */
                           /* the address in ptr */
var2 = *ptr + 1;           /* Access var using indirection, that is, through */
                           /* the address in ptr */
What value does the variable var2 have? It is set to 2!
• Pointers are variables so they can be used without dereferencing.
Example:
◦ int x, *iq, *ip=&x;
/*declares 3 variables, 2 of which are integer pointers*/
iq = ip;
/* Copies the contents of ip (an address) into iq, making iq point to whatever ip points to */
IMPORTANT NOTE:
int x;
int *ip = &x;
is equivalent to:
int x;
int *ip;
ip = &x; /* &x is assigned to ip, NOT *ip */
• int **x; /* assume integer is 32 bits */
• Read as: declare x as a pointer to a pointer to an integer (or a pointer to an integer pointer)
• Interpretation: Declare x as an 8-byte variable that holds the numeric address at which is another 8-byte numeric address at which are 32 bits (4 bytes) that we intend to manipulate as a signed integer
[Diagram: the 8-byte variable x holds an address; at that address is an 8-byte value that is also an address; that second address is where the 4-byte int variable lives.]
• int i = 5;
• char *y = (char *)(&i);
• Declares i as an integer and puts 5 in that location, then declares y as a pointer to a character and assigns to y the address of i cast to (interpreted as) a pointer to a character.
• &i and y are both the same numeric value pointing to the same piece of memory
• Dereferencing y without casting hereafter generates instructions that operate on chars at i’s memory address instead of integers
• Note that, in casts, the dereference operator follows the type name: (int *) OR (float *) OR (char *) etc.
int i = 5;
char *y = (char*)(&i);
Really? Both point to the value 5???
At least on SOME machines. CSE servers are ones where it will be true because CSE servers use “little endian” byte ordering.
Read Section 2.1.3 in Bryant/O’Hallaron
If you want to do this type of thing, better to test the result on your server to ensure your code does what you think it is doing.
We’ll work more with “endian” later in the semester.
#include <stdio.h>
int main(void)
{
    int i = 0x02030405;
    char *y;
    y = (char*)(&i);
    printf("the address of i is %x\n", &i);
    printf(" the value of y is %x\n", y);
    printf(" y points to the value %i\n", *y);
    printf(" y+1 points to the value %i\n", *(y+1));
    printf(" y+2 points to the value %i\n", *(y+2));
    printf(" y+3 points to the value %i\n", *(y+3));
    return(0);
}
Output is:
[jones.5684@sl6 test]$ test_ptr
the address of i is 46135f44
 the value of y is 46135f44
 y points to the value 5
 y+1 points to the value 4
 y+2 points to the value 3
 y+3 points to the value 2
[jones.5684@sl6 test]$
• Every pointer points to a specific data type.
◦ The only exception: “pointer to void” is used to hold the address of any type but cannot be dereferenced without a cast (more later)
• If ip points to the integer x (ip = &x), then *ip can occur in any context where x could
◦ Example: *ip = *ip + 10; is equivalent to x = x + 10; it increases the value stored at the address in ip by 10
• The unary operators * and & have higher precedence than arithmetic operators
• In a declaration
◦ * says “I am a variable that contains an address” that points to a certain type of value
• In a statement
◦ & = “get the address of a variable”
◦ * = “access (get/read or set/write) the value at the address stored in the variable which follows the dereference operator”
• No matter how complex a pointer structure gets, the list of rules remains short:
• A pointer stores a reference to its pointee. The pointee, in turn, stores something useful. The reference is a memory address.
• The dereference operation on a pointer accesses its pointee. A pointer may only be dereferenced after it has been assigned a value. Most pointer bugs involve violating this one rule.
• Allocating a pointer does not automatically assign it to refer to a pointee. Assigning the pointer to refer to a specific pointee is a separate operation which is easy to forget.
• Assignment between two pointers makes them refer to the same pointee which introduces sharing.
• NOTE: A “pointee” is a variable whose address is assigned to be the value of a pointer.