Pointer fundamentals, syntax, and operations.
The pointer is a concept found in many programming languages. Pointers in C++ are especially important because in C++ there is a close relationship between pointers and arrays, C strings, and references. Thoroughly understanding pointers will give the programmer amazing superpowers and rockstar incredibleness.
Before we go any further, you will want to remember that each variable in a program is stored in a particular location in computer memory and that these locations are identified internally using memory addresses. Depending on the size of the variable, the location may occupy a block from one to several bytes. The base address of a variable is the memory address corresponding to the starting point of this block.
You can think of a pointer variable as a variable that stores the base address of some other variable.
And?
For the moment, let's not consider why you would ever want to do such a thing. It turns out there are lots of cool things you can do with pointers, but you'll just have to trust me on this. Right now, let's just try to get our heads around the concept a little better.
There are two ways you can think about pointers. The first way relates pointers to what is going on vis-a-vis the computer’s hardware. The second is more abstract, visual, and high-level. Both ways have their advantages. Usually, an individual programmer becomes more comfortable using one or the other; however, understanding both will be helpful no matter which model you prefer.
A regular variable occupies a fixed set of bytes in memory. For example, let's assume we want to create an int named count and store the value 7 in it. If an int on our system is 4 bytes, when we run the program, the operating system may assign the memory locations 58000, 58001, 58002, and 58003 to count. If we examine the contents of memory locations 58000 to 58003 as a unit, we would see a 4 * 8 = 32 bit pattern that represents the value 7, as tabulated below.
| Variable name | Memory location | Memory contents across all four bytes (32 bits) |
|---|---|---|
| count | 58000 | 00000000000000000000000000000111 (i.e., 7 in decimal) |
| | 58001 | |
| | 58002 | |
| | 58003 | |
If we were now to ask what the address of count is (which we will learn how to do shortly), the system would tell us count is located at its base address, 58000 in this case.
So, now let's say that (for whatever reason) we wanted to create a variable to store the address where count is located. That variable is what we would call a pointer, because it “points to” a block of memory. Extending the example above, let's say that we create a pointer variable named countPtr and set it to point to the int that is count. Furthermore, let's say that pointer variables on our system are stored in 8-byte (i.e., 64-bit) units, and that when we run the program the operating system places countPtr at an 8-byte block starting at 64002. That would yield something that looks like:
| Variable name | Memory location | Memory contents across all 8 bytes (64 bits) |
|---|---|---|
| countPtr | 64002 | 0000000000000000000000000000000000000000000000001110001010010000 (i.e., 58000 in decimal) |
| | 64003 | |
| | 64004 | |
| | 64005 | |
| | 64006 | |
| | 64007 | |
| | 64008 | |
| | 64009 | |
Note that while count occupies a total of four bytes of memory (bytes 58000 to 58003), countPtr stores only the address of the first byte of count.
In the abstract, visual, high-level model of pointers, we can think of them as graphically pointing to other memory locations. For example, given the count variable used above:
the countPtr pointer variable above would be represented as:
In this diagram, the value of the pointer countPtr is the box the arrow points to (i.e., the integer count).
Just as the value of a normal variable can change, the value of a pointer can change as well. Let’s say in addition to the variable count we also have a variable num that holds the value 2. If we change the value of countPtr to point to num, the diagram would now look like:
In C++, the * character is used to indicate pointer variables in declarations and function parameter lists. The code fragment below creates a pointer variable named myPtr that can point to a block of memory that stores an int:
int *myPtr; // declare a pointer to an int
The location of the * character is flexible: it can be placed before the name of the pointer (as in the above example), immediately after the type, or separated from both the type and the name by whitespace. All the declarations below are equivalent:
    int *myPtr;
    int* myPtr;
    int * myPtr;
You can declare more than one pointer variable at a time:
int *myPtr, *anotherOne; // declare two pointers to an int
But be careful: the following actually creates one pointer and one int:
int* myPtr, anotherOne; // a pointer and an int
Pointers can be created to point to any type:
    char *yourPtr;    // declare a pointer to a char
    double *zerPtr;   // declare a pointer to a double
    string *hyPtr;    // declare a pointer to a string
To set the value of a pointer, you need the address of something. C++ has an address operator, &, that returns the base address of a variable:
    int num = -42;
    cout << num << endl;    // prints value held in variable num
    cout << &num << endl;   // prints the base address of variable num
Most environments show base addresses as hexadecimal numbers by default.
The example below declares an integer variable y that is initialized to 5 and a pointer variable to an int named myPtr that is set to store the address of (i.e., “point to”) y.
    int y = 5;      // declare an integer variable y
    int *myPtr;     // declare a pointer to int
    myPtr = &y;     // myPtr gets address of ("points to") y
The result of this code fragment may be diagrammed as follows:
and when run may generate the following memory map:
| Variable name | Memory location | Memory contents |
|---|---|---|
| y | 52000 | 5 |
| | 52001 | |
| | 52002 | |
| | 52003 | |
| … | … | … |
| myPtr | 63002 | 52000 |
| | 63003 | |
| | 63004 | |
| | 63005 | |
| | 63006 | |
| | 63007 | |
| | 63008 | |
| | 63009 | |
You can change the value of pointer variables (that's why it's called a variable!):
    double z = 3.33;
    double x = 42.0;
    double *myPtr;
    myPtr = &z;     // myPtr gets address of z
    myPtr = &x;     // myPtr now has address of x
Local pointer variables in C++ are not automatically initialized. This means that a pointer variable declaration inside a function along the lines of
double *myPtr;
leaves the pointer pointing to an arbitrary memory location. This is dangerous because poking your fingers into arbitrary memory locations is a good way to corrupt your program's memory or create other headaches.
Pointer variables can be initialized when declared. It is good programming practice to always initialize pointers so they do not accidentally point to unknown memory locations. The code below initializes the value of the pointer variable in the declaration:
    int y = 5;
    int *myPtr = &y;    // myPtr gets address of y
You can set a pointer to a special value that indicates that it is pointing to nothing. This special value has the name nullptr (C++11 and later) or NULL.
If the memory location that a pointer variable will point to isn’t known at the time it is declared, then you should initialize it to nullptr or NULL. This will prevent the pointer from accidentally pointing to arbitrary memory locations.
Below is an example of initializing a pointer variable to nullptr when declaring it.
    int *yourPtr = nullptr;   // yourPtr points to nothing
    int y = 5;
    yourPtr = &y;             // yourPtr gets address of y
Note how yourPtr is initialized to nullptr even though we change the pointer immediately afterwards. We did this because it's a good habit to initialize your pointers so they never point to unknown memory locations.
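nullptr is a dedicated null pointer literal with its own type, so it cannot be confused with an integer. The integer literal 0 also converts to a null pointer, so sometimes you will see pointer variables set to 0 instead, and NULL is a macro that traditionally expands to 0. In older code you may also see pointers set to the character '\0', which has an integer value of 0 as well, although current compilers may reject that form. In each case the effect is the same: the pointer points to nothing. Prefer nullptr in C++11 and later.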
You were introduced to the address operator, &, above. The address operator is one of two fundamental pointer operators. The other is the indirection or dereferencing operator.
The indirection or dereferencing operator, *, accesses the value of what its operand points to. For example, if myPtr points to variable y, *myPtr returns the value stored in y.
    int y = -1;         // declare y and initialize its value
    int *myPtr = &y;    // declare pointer and set it to point to y
    cout << *myPtr;     // prints -1
The * operator can be used to assign a value to a location in memory:
    int y = -1;         // declare y and initialize its value
    int *myPtr = &y;    // declare pointer and set it to point to y
    *myPtr = 7;         // change value in y to 7
    cout << y;          // prints 7
You can think of the indirection/dereferencing operator as meaning, “the_thing_at_the_end_of_”, as in:
the_thing_at_the_end_of_myPtr = 7;
The * and & operators complement each other. In other words, the following expressions will all evaluate as true:
    *&y == y
    &*myPtr == myPtr
    *&myPtr == myPtr
The example below demonstrates the use of pointers, including the address and dereferencing operators.
    /** Demonstrate basic pointer usage. */
    #include <iostream>
    using namespace std;

    int main()
    {
        int a;                // a is an integer
        int *aPtr = nullptr;  // aPtr is a pointer to an integer

        a = 7;      // give a a value
        aPtr = &a;  // set aPtr to the address of a

        cout << "The value of a is: " << a << endl
             << "The address of a is: " << &a << endl
             << "The value of aPtr is: " << aPtr << endl;
        cout << endl;
        cout << "The value of a is: " << a << endl
             << "The value of *aPtr is: " << *aPtr << endl;
        cout << endl;
        cout << "Showing that * and & are inverses of each other:" << endl
             << "&*aPtr = " << &*aPtr << endl
             << "*&aPtr = " << *&aPtr << endl;

        return 0;
    }