CS - C++ Lesson 31:  Pointers & Dynamic Data Structures

Pointer variables open up a new world of programming. The power and creativity of pointer-based data structures give the programmer a much broader range of tools for problem solving.  This lesson will serve as an introduction to pointers and an overview of pointer-based data structures. 

The key topics for this lesson are:

A.  Pointer Variables
B.   Syntax of Pointer Variables
C.  Allocation of Dynamic Memory
D.  Pointer Variable Manipulations
E.   Dynamic Data Structures

VOCABULARY:                   POINTER VARIABLES         ADDRESS
                                                INDIRECTION                       NEW
                                                DELETE                                  NULL
                                                STATIC                                   DYNAMIC
                                                DEREFERENCE

DISCUSSION:                       A.  Pointer Variables

  1.   A pointer variable is a variable that stores a different kind of value, an address.

2.   The random access memory of a computer is mapped with locations called addresses.  But instead of 123 Main Street, memory addresses are labeled with hexadecimal numbers like F5A2.  Fortunately, we will not have to worry about or work with hexadecimal numbers (base 16), but it is helpful to know what type of values are stored in pointer variables.

3.   Indirection is a technique where information directs you to the actual location of an object.  This concept of indirection is what makes pointer variables difficult to understand at first. 

4.   Suppose you are driving from San Jose to San Francisco on Interstate 280.  When a large sign stating “San Francisco” appears above the road, it is a pointer directing you to the object you are seeking.  The sign is not the city of San Francisco, it is a method of indirection pointing you to your destination.

5.   A pointer variable is like a traffic sign.  It will provide information about where data is stored.  It points to a memory location.

B.   Syntax of Pointer Variables

1.   A pointer variable is declared by using the asterisk operator (*) with already defined or existing types.  In a declaration, the asterisk operator (*) must precede the identifier being declared as a pointer.

int  *a, *b;            // a and b are pointer variables 

2.   Each of the variables a and b will not directly store integers.  They will store the addresses where an integer is stored in memory.

3.   A declared pointer in C++ is uninitialized and therefore contains a garbage address value.  To use such garbage addresses in a program is to invite disaster and usually results in a program crash.

4.   Pointer variables are often used in conjunction with the address operator (&) which returns the address of a variable.  Here is a familiar example, used in Lesson 30.

Program 31-1

#include <iostream.h>

main
()
{
        int  num = 5;
        int  *where;          // where is a pointer, stores an address
        where = &num;    // where now stores the address of num
        cout << "The address stored in where = " << where << endl;
        cout << "The contents of *where = " << *where << endl;
        return 0;
}

        Run output (the address is machine-dependent):
     The address stored in where = a7b2
      The contents of *where = 5  

5.   The declaration  int  *where  implements a pointer variable.  The statement  where = &num  assigns the address of num to the pointer variable where.  The address stored in where is machine-dependent.

6.   The (*) operator is called the dereferencing operator.  This unary operator returns the contents of the address it is applied to.  The effect of  *where  is to return the contents of address a7b2.  Another way to translate  *where  is to think of it as *(a7b2).  The first cout statement prints out the address stored in where, address a7b2.  The second cout statement prints out the contents of *where, which is 5.

7.   Notice that the (*) operator is used in two different ways in this program.  When it is part of a variable declaration (int  *where) it is setting up a pointer variable.  When it is used as part of a statement, (*where) it is dereferencing a memory address.

8.   A pointer variable can point to no address.  This is accomplished in one of two ways.  Assume a is a pointer variable.

      a = 0;

      a = NULL;

      In either case, the pointer variable a is holding no address.

C.  Allocation of Dynamic Memory

 1.   The term dynamic refers to change.  A program which uses dynamic memory is one which adjusts the amount of memory used depending on the changing needs of a program.  For example a small word processing document needs little memory.  As the document gets longer, the word processing program allocates more memory to store the growing number of characters.

 2.   When a C++ program is run, a certain segment of RAM is available for the program to use.  This stack of memory is called a heap.  Memory is allocated for use from the top of the heap.

 3.   C++ provides a new command which allocates memory for use with pointer variables.  The general form of the new command is the following:

       new  type

       This command will return a block of memory large enough to hold an object with the specification of type.  If the allocation is successful, the operation returns the starting address of a block of memory, otherwise it returns the NULL address 0.

 4.   The new command is applied to a specific type and the memory block returned is stored in a pointer variable.  For example the code

       int  *a;

      a = new int;

       allocates a block of memory (address D2F6 in Diagram 31-1) to store an integer and returns that address to a.  It is in this address that the integer value will eventually be stored.

 5.   To store an integer in this newly allocated memory location requires the use of the indirection (asterisk) operator, *.

 *a = 5; 

    Diagram 31-1

                  

a.   The memory location allocated is D2F6.

b.   The pointer variable a stores a memory location, D2F6.

c.   The indirection operator (*) performs a dereference.  In a sense, the pointer variable a is a reference variable for another address.  However that address has no other variable name.  The statement (*a) means to dereference a and work directly with the memory location that was aliased.

d.   Another way of thinking about the statement (*a) is "to inspect the memory location referenced by a."

e.   This is an example of assignment to a location through a pointer variable.           

6.   The above is indirection in action.  The pointer variable a does not store an integer; it redirects you to where the desired information is located.

7.   We do not need to know the actual memory locations although C++ will allow us to print out such information.  The statement

      cout << a;

      will print out the address D2F6 if applied to diagram 31-1.

8.   To streamline our thinking about pointers and data structures it will be best to eliminate the actual addresses from our pictures.  Therefore, we will use abbreviated terminology and diagrams to illustrate pointers.  The previous diagram as shown above in key topic C, #5, will now be shown as the following:

     

D.  Pointer Variable Manipulations

1.   Assume the following declarations apply in section D.

 int  *a, *b;

 2.   This fragment of code demonstrates the assignment of one address stored by a pointer to another pointer.  Here is another way of thinking about this assignment:  a pointer is made to point to the same address as another pointer.

 a = new int;

*a = 5;

b = a;

 

 a.   The new command allocates memory (C6F2) and a stores this address.

b.   The statement  *a= 5;  places the integer value 5 in the address which the pointer a is storing.

c.   The pointer  b  now stores the same memory location (C6F2) which  a  is storing.  We now have two pointers pointing to the same memory address (C6F2). 

 3.   This next example demonstrates the transfer of data from one memory location to another, using pointer variables.

 a = new int;

b = new int;

*a = 8;

*b = 10;

*b = *a;

 

 a.   a points to an address holding the integer value 8 and b points to the address holding the integer value 10.

b.   In the statement  *b = *a, the value of 8 is transferred from one memory location to another.  We are using the pointer variables, going through them to access the integer values. 

 4.   A last example illustrates something we do not want to do with pointer variables:

 a = new int;

b = new int;

*a = 2;

*b = 5;

b = a;

            

 a.   The last statement causes the pointer variable b to receive the address stored in a.  This is legal code, except we have lost track of the original address of b and the value of 5 stored in that address.  There will be occasions when we wish to eliminate unwanted information stored in data structures, but we do not want to waste memory.

 b.   To avoid losing track of memory and wasting space, the delete operation is needed.  The delete operation causes the address stored by a pointer variable to be returned to the heap. 

 a = new int;

b = new int;

*a = 2;

*b = 5;

delete b;

b = a;

      

 c.   The delete operation returns unwanted memory back to the heap, which will be the first memory allocated when the next new operation is invoked.

 5.   And finally, a look at an incorrect pointer statement:

 a = 5;                               

      This attempts to assign the memory location 5 to the pointer a.  While this is possible in C++, you do not want to work with random memory locations.  Memory location 5 could be the location of critical system information.

E.   Dynamic Data Structures

1.   During the next month of the course, you will learn how to use pointer variables to build dynamic data structures.

2.   A dynamic data structure is one which is created during the running of the program.  Its size and memory usage fluctuate according to the needs of the program.

3.   The apvector class created objects that had characteristics of both static and dynamic data structures.  When the vector was allocated it had a certain fixed size of cells.  It is possible to resize a vector but constant resizing will slow down the program.

4.   A linked list is conceptually similar to a vector but the resizing will be handled in a more efficient and direct manner.

5.   The two most important types of dynamic data structures you will learn about are linked lists and binary trees.

SUMMARY/REVIEW:         Like a single record variable, a single pointer variable is interesting but not very useful.  It is when we put pointer variables together in a data structure that pointers become powerful programming tools.

ASSIGNMENT:                    Worksheet W.A.31.1, Pointer Type