====== Pointers 1 ======

Pointer fundamentals, syntax, and operations.((Portions loosely adapted from: Deitel, Harvey M., and Paul J. Deitel. "Pointers and Strings." In //C++: How to Program//. 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2001. 304-388.))

===== Introduction =====

The **pointer** is a concept found in many programming languages. Pointers are especially important in C++ because of the close relationship between pointers and arrays, C strings, and references. Thoroughly understanding pointers will give the programmer amazing superpowers and rockstar incredibleness.

===== Address of a variable =====

Before we go any further, remember that each variable in a program is stored in a particular location in computer memory and that these locations are identified internally by memory addresses. Depending on the size of the variable, the location may occupy a block of one or more bytes. The **base address** of a variable is the memory address of the first byte of this block.

===== Pointer variables =====

You can think of a **pointer variable** as a variable that stores the base address of some other variable.

For the moment, let's not consider why you would ever want to do such a thing. It turns out there are lots of cool things you can do with pointers, but you'll just have to trust me on this. Right now, let's just try to get our heads around the concept a little better.

There are two ways you can think about pointers. The first relates pointers to what is going on //vis-a-vis// the computer’s hardware. The second is more abstract, visual, and high-level. Both have their advantages. Usually, an individual programmer becomes more comfortable with one or the other; however, understanding both is helpful no matter which model you prefer.

==== Pointers in relation to computer hardware ====

A regular variable occupies a fixed set of bytes in memory.
For example, let's assume we want to create an ''int'' named ''count'' and store the value 7 in it. If an ''int'' on our system is 4 bytes, then when we run the program, the operating system may assign the memory locations 58000, 58001, 58002, and 58003 to ''count''. If we examine the contents of memory locations 58000 to 58003 as a unit, we would see a 4 * 8 = 32-bit pattern that represents the value 7, as tabulated below.

^ Variable name ^ Memory location ^ Memory contents across all four bytes (32 bits) ^
|''count''|58000|''00000000000000000000000000000111'' (i.e., 7 in decimal)|
|:::|58001|:::|
|:::|58002|:::|
|:::|58003|:::|

If we were now to ask what the address of ''count'' is (which we will learn how to do shortly), the system would tell us ''count'' is located at its base address, 58000 in this case.((This holds on both little-endian and big-endian systems. Endianness affects the order in which the bytes of the value are stored within the block, not which address identifies the block.))

So, now let's say that (for whatever reason) we wanted to create a variable to store the address where ''count'' is located. That variable is what we would call a //pointer//, because it "points to" a block of memory.

Extending the example above, let's say that we create a pointer variable named ''countPtr'' and set it to point to the ''int'' that is ''count''. Furthermore, let's say that pointer variables on our system are stored in 8-byte (i.e., 64-bit) units, and that when we run the program the operating system places ''countPtr'' in an 8-byte block starting at 64002.
That would yield something that looks like:

^ Variable name ^ Memory location ^ Memory contents across all 8 bytes (64 bits) ^
|''countPtr''|64002|''0000000000000000000000000000000000000000000000001110001010010000'' (i.e., 58000 in decimal)|
|:::|64003|:::|
|:::|64004|:::|
|:::|64005|:::|
|:::|64006|:::|
|:::|64007|:::|
|:::|64008|:::|
|:::|64009|:::|

Note that while ''count'' occupies a total of four bytes of memory (bytes 58000 to 58003), ''countPtr'' stores only the address of the first byte of ''count''.

==== Pointers in the abstract ====

In the abstract, visual, high-level model of pointers, we can think of them as graphically pointing to other memory locations. For example, given the ''count'' variable used above:

{{ :cplusplus:ptr01.png?nolink |}}

the ''countPtr'' pointer variable above would be represented as:

{{ :cplusplus:ptr02.png?nolink |}}

In this diagram, the //value// of the pointer ''countPtr'' is the box the arrow points to (i.e., the integer ''count'').

Just as the value of a normal variable can change, the value of a pointer can change as well. Let’s say that in addition to the variable ''count'' we also have a variable ''num'' that holds the value 2. If we change the value of ''countPtr'' to point to ''num'', the diagram would now look like:

{{ :cplusplus:ptr03.png?nolink |}}

==== Pointer syntax basics ====

=== Declaration ===

In C++, the ''*'' character is used to indicate pointer variables in declarations and function parameter lists. The code fragment below creates a pointer variable named ''myPtr'' that can point to a block of memory that stores an ''int'':

<code cpp>
int *myPtr;   // declare a pointer to an int
</code>

The location of the ''*'' character is flexible: it can be placed immediately before the name of the pointer (as in the above example), immediately after the type, or separated from both the type and the name by whitespace.
All the declarations below are equivalent:

<code cpp>
int *myPtr;
int* myPtr;
int * myPtr;
</code>

You can declare more than one pointer variable at a time:

<code cpp>
int *myPtr, *anotherOne;   // declare two pointers to int
</code>

But be careful: the following actually creates one pointer and one ''int'', because the ''*'' applies only to the name that follows it:

<code cpp>
int* myPtr, anotherOne;   // a pointer and an int
</code>

Pointers can be created to point to any type:

<code cpp>
char *yourPtr;     // declare a pointer to a char
double *zerPtr;    // declare a pointer to a double
string *hyPtr;     // declare a pointer to a string
</code>

=== Assignment ===

To set the value of a pointer, you need the address of something. C++ has an **address operator**, ''&'', that returns the base address of a variable:

<code cpp>
int num = -42;
cout << num << endl;    // prints value held in variable num
cout << &num << endl;   // prints the base address of variable num
</code>

Most environments show base addresses as [[http://www.cplusplus.com/doc/hex/|hexadecimal]] numbers by default.

The example below declares an integer variable ''y'' that is initialized to 5 and a pointer to ''int'' named ''myPtr'' that is set to store the address of (i.e., "point to") ''y''.

<code cpp>
int y = 5;      // declare an integer variable y
int *myPtr;     // declare a pointer to int
myPtr = &y;     // myPtr gets address of ("points to") y
</code>

The result of this code fragment may be diagrammed as follows:

{{:cplusplus:ptr04a.png?nolink|}}

and when run may generate the following memory map:

^ Variable name ^ Memory location ^ Memory contents ^
|''y''|52000|5|
|:::|52001|:::|
|:::|52002|:::|
|:::|52003|:::|
|...|...|...|
|''myPtr''|63002|52000|
|:::|63003|:::|
|:::|63004|:::|
|:::|63005|:::|
|:::|63006|:::|
|:::|63007|:::|
|:::|63008|:::|
|:::|63009|:::|

You can change the value of pointer variables (that's why it's called a //variable//!):

<code cpp>
double z = 3.33;
double x = 42.0;
double *myPtr;
myPtr = &z;     // myPtr gets address of z
myPtr = &x;     // myPtr now has address of x
</code>

=== Initialization ===

Local pointer variables in C++ are not automatically initialized.
This means that a pointer variable declaration along the lines of

<code cpp>
double *myPtr;
</code>

leaves the pointer holding an arbitrary address. This is dangerous because poking your fingers into arbitrary memory locations is a good way to corrupt your program's memory or create other headaches.

Pointer variables can be initialized when declared. It is good programming practice to always initialize pointers so they do not accidentally point to unknown memory locations. The code below initializes the value of the pointer variable in the declaration:

<code cpp>
int y = 5;
int *myPtr = &y;   // myPtr gets address of y
</code>

==== nullptr/NULL pointers ====

You can set a pointer to a special value that indicates that it is //pointing to nothing//. This special value has the name ''nullptr'' (C++11 and later) or ''NULL'' (older code).

If the memory location that a pointer variable will point to isn’t known at the time it is declared, then you should initialize it to ''nullptr'' or ''NULL''. This will prevent the pointer from accidentally pointing to arbitrary memory locations. Below is an example of initializing a pointer variable to ''nullptr'' when declaring it.

<code cpp>
int *yourPtr = nullptr;   // yourPtr points to nothing
int y = 5;
yourPtr = &y;             // yourPtr gets address of y
</code>

Note how ''yourPtr'' is initialized to ''nullptr'' even though we change the pointer immediately afterwards. We did this because it's a good habit to initialize your pointers so they //never// point to unknown memory locations.

''NULL'' is a null pointer constant, traditionally defined as the integer 0, so sometimes you will see pointer variables set to 0 instead. Older code occasionally sets pointers to the character '''\0''', which also has an integer value of 0, but current compilers may warn about or reject this. All of these are meant to make the pointer point to nothing. In C++11 and later, prefer ''nullptr'', which is a keyword with its own type rather than an integer.

==== Pointer operators ====

You were introduced to the **address operator**, ''&'', above. The address operator is one of two fundamental pointer operators. The other is the **indirection** or **dereferencing operator**.
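Before moving on to the second operator, here is a minimal sketch of what can already be done with the address operator and ''nullptr'' alone, without dereferencing anything. The function names ''describe'' and ''initializationDemo'' are hypothetical and exist only for this illustration; the variable names follow the ''nullptr'' example above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the tutorial): describes a pointer
// by comparing it with nullptr, without dereferencing it.
std::string describe(const int *ptr) {
    if (ptr == nullptr) {
        return "points to nothing";
    }
    return "points to an int";
}

// Follows the pattern from the nullptr section: initialize first, assign later.
std::string initializationDemo() {
    int *yourPtr = nullptr;                  // yourPtr points to nothing
    std::string before = describe(yourPtr);

    int y = 5;
    yourPtr = &y;                            // yourPtr gets address of y
    assert(yourPtr == &y);                   // both expressions hold y's address
    std::string after = describe(yourPtr);

    return before + ", then " + after;
}
```

Calling ''initializationDemo()'' from ''main'' returns a string describing the pointer before and after the assignment. Comparing against ''nullptr'' only works reliably because the pointer was initialized; an uninitialized pointer could hold any address.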
=== Indirection/dereferencing operator ===

The **indirection** or **dereferencing** operator, ''*'', accesses the value of what its operand points to. For example, if ''myPtr'' points to variable ''y'', ''*myPtr'' returns the value stored in ''y''.

<code cpp>
int y = -1;         // declare y and initialize its value
int *myPtr = &y;    // declare pointer and set it to point to y
cout << *myPtr;     // prints -1
</code>

The ''*'' operator can also be used to assign a value to a location in memory:

<code cpp>
int y = -1;         // declare y and initialize its value
int *myPtr = &y;    // declare pointer and set it to point to y
*myPtr = 7;         // change value in y to 7
cout << y;          // prints 7
</code>

You can think of the indirection/dereferencing operator as meaning, “the_thing_at_the_end_of_”, as in:

<code>
the_thing_at_the_end_of_myPtr = 7;
</code>

The ''*'' and ''&'' operators are inverses of each other. In other words, the following expressions all evaluate to true:

<code cpp>
*&y == y
&*myPtr == myPtr
*&myPtr == myPtr
</code>

==== Example ====

The example below demonstrates the use of pointers, including the address and dereferencing operators.

<code cpp>
/** Demonstrate basic pointer usage. */
#include <iostream>
using namespace std;

int main()
{
    int a;                  // a is an integer
    int *aPtr = nullptr;    // aPtr is a pointer to an integer

    a = 7;                  // give a a value
    aPtr = &a;              // set aPtr to the address of a

    cout << "The value of a is: " << a << endl
         << "The address of a is: " << &a << endl
         << "The value of aPtr is: " << aPtr << endl;
    cout << endl;

    cout << "The value of a is: " << a << endl
         << "The value of *aPtr is: " << *aPtr << endl;
    cout << endl;

    cout << "Showing that * and & are inverses of each other:" << endl
         << "&*aPtr = " << &*aPtr << endl
         << "*&aPtr = " << *&aPtr << endl;

    return 0;
}
</code>
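The pointer diagrams earlier on this page, where ''countPtr'' first points to ''count'' and then to ''num'', can also be sketched in code. The block below is an illustration only: the function names ''repointDemo'' and ''identitiesHold'' are hypothetical, and the values written through the pointer are arbitrary choices.

```cpp
#include <cassert>

// Hypothetical demo function: re-points one pointer at two different ints.
int repointDemo() {
    int count = 7;
    int num = 2;

    int *countPtr = &count;   // arrow points to count
    *countPtr = 8;            // writes through the pointer: count is now 8

    countPtr = &num;          // arrow now points to num
    *countPtr = 3;            // num is now 3; count is unchanged

    assert(count == 8 && num == 3);
    return count * 10 + num;  // packs both values into one int: 83
}

// Hypothetical check that * and & undo each other.
bool identitiesHold() {
    int y = 5;
    int *myPtr = &y;
    return *&y == y && &*myPtr == myPtr && *&myPtr == myPtr;
}
```

Changing where ''countPtr'' points (''countPtr = &num;'') and changing what it points at (''*countPtr = 3;'') are two different operations; mixing them up is a common early mistake.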