C : Introduction to pointers

Last updated, April 1, 1997
Written by David Reilly, 1997
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Pointers in C, while being an exceedingly important aspect of the language, can also be exceedingly difficult for the novice programmer. We'll start by covering an overview of pointers and arrays, and then move on to more conceptually challenging topics such as passing by ref/value and a linked list structure.

Pointer basics

A pointer, simplistically, is a reference to an address in memory that contains some data type, such as an integer/character or structure. If we have a pointer to a data type, and our program "owns" the piece of memory it is pointing to, we can read and modify the data stores there.

A pointer to a data type is declared the same way as any other variable, save that an asterix ' * ' is placed before the name of the variable. To indicate that it is a pointer. When we refer to the pointer, we can use another asterix to indicate we are dealing with the data referenced by the pointer, rather than the pointer itself.

Examine the following example, which contains a pointer to an integer. We assign to our pointer the address ' & ' of an integer, and then modify the data stored there.

void main()
{
    int number;
    int *pointer_to_number;

    number = 5;
    printf ("Our number is currently %d\n", number);

    /* Assign to our pointer the address of 
       variable 'number' */
    pointer_to_number = &number;

    /* Modify the value stored at 
       pointer_to_number */
    *pointer_to_number = *pointer_to_number +1;

    printf ("Our number is now %d\n", number);
}

As you will see when you run the example, the value of number is incremented, even those we never added to number directly! Instead, we changed the value stored at the memory address shared by number, and hence indirectly modified it. Hopefully by now you will be starting to understand the powerful potential offered by pointers. With pointers, we can access memory locations directly - but with power comes responsibility.....

Imagine what would occur if we declared a pointer, as in the above code, but didn't assign a valid memory address to it. We could read, and even overwrite data belonging to some unknown variable or piece of code - perhaps even belonging to another program. This will cause unpredictable results, and can crash both your own program and others.

For this reason, unless specifically using memory address that are known to be safe (eg - the address of a locally defined variable as is the case with the code above), one should never read from, or write to, an undefined pointer.

Tip - A good programming practice is to declare pointer values to NULL, which indicates it doesn't refer to a valid memory address. You'll notice this later when we move on to linked-lists.

Pointers and arrays

If you've used C for awhile, you'll no doubt be confident that you have mastered the concept of an array. After all, without learning arrays, you can't define a string (which is defined as a null-terminated array of char).

Now we'll throw a spanner into the works. An array is actually a pointer. When you have a printf statement, such as

  printf ("%s\n", mystring)

you are actually passing as a parameter the address of a sequence of characters. Sound confusing? Take a look at this next example.

void main()
{
    /* 80 bytes of memory */
    char mystring[80], *mypointer; 

    /* Copy a dummy string in */
    strcpy (mystring, "Hello there!");

    /* Since an array is already a pointer,
       we don't need to use &mystring */
    mypointer = mystring;

    /* Prove its a pointer by printing it */
    printf ("%s\n", mypointer);

    /* Print string at offset 6 from mypointer */
    printf ("%s\n", mypointer + 6);
}

After running the program, you'll see the words "Hello there!" printed to screen, followed again by a second "there!". There are two print statements, one for mypointer, and another for mypointer + 6. Since a pointer is just a memory address, we can add to this address just like any other number. Remember though that in C, arrays are zero-indexed! To get the nth character in an array, we can either request array[n-1], or *(array + n -1).

Passing pointers to functions

Up until now, whenever you have passed a variable to a function in C, the function has been unable to change its value. A function could modify the value such that it changed within the scope of the function, but when control is returned to the line from which it was called, the changes are not present.

When a function is passed a variable, it is given a duplicate of the value, not the original variable itself. The distinction may be at first hard to comprehend, but in essence changes made to the variable exist only within the scope of the function. This is highlighted in the example below.

/* Function prototype */
void manipulate_int ( int number );

void main()
{
    int a;

    /* Assign five to integer A */
    a = 5;

    /* Display value of A to screen */
    printf ("Value of A : %d\n", a);

    manipulate_int (a);

    /* Display value of A to screen again,
       to prove A remains unchanged */
    printf ("Value of A : %d\n", a);

}

/* Increments and prints number passed to it,
   changing the number within its scope */
void manipulate_int ( int number )
{
   /* Modify the value of the number */
   number = number + 1;

   printf ("Number is now : %d\n", number);
}  

After running the program, you should see that while the number was modified within the manipulate_int function, it didn't change the value in main. This is because we are passing our integer by value, rather than by reference.

Remembering the principles learnt in our first example, where we increment an integer by manipulating the data stored at a pointer location, we can make changes from functions permanent. Pascal programmers achieve this by appending the ' VAR ' keyword before variables - C programmers do it with style, by passing a pointer to the variable!

The following code example is almost identical to the previous one, save that the function now requires a pointer to an integer, rather than an integer itself. After running it, you'll see that changes made by the manipulate_int function are permanent.

/* Function prototype */
void manipulate_int ( int *number );

void main()
{
    int a;

    /* Assign five to integer A */
    a = 5;

    /* Display value of A to screen */
    printf ("Value of A : %d\n", a);

    manipulate_int (&a);

    /* Display value of A to screen again,
       to prove A remains unchanged */
    printf ("Value of A : %d\n", a);

}

/* Increments and prints number passed to it
   as a pointer, changing the data stored at
   the address pointed to by the pointer :) */
void manipulate_int ( int *number )
{
   /* Modify the value of the number */
   *number = *number + 1;

   printf ("Number is now : %d\n", *number);
}  

Pointers and data structures

As a means of passing modifiable data to functions, pointers offer great potential. However, this only just begins to scratch the surface. Until now, we've been limited to defining individual structures, and grouping them together in an array.

This presents several problems however. If we want to delete a particular item in an array, we either have to mark it as logically deleted (through the use of some flag, which is inefficient because the structure still takes up memory), or we have to shuffle the array down by one to the point at which the physical deletion took place. Neither method is desirable.

There is however, a solution, thanks to the use of pointers! We start by creating a structure, which contains a pointer reference to a duplicate of the structure. We can link together a sequence of structures, chained together by pointers - a sequential linked-list of data that can be traversed by moving from one node onto the next.

Standard linked list

Linked-list structure

As long as there is a variable pointing to the head of the list, a program can cycle through a while loop, searching for data. When the tail of the list is reached (the pointer to the next node is equal to null), the program can append a node to the list. If the list has to be sorted, then a program can cycle through each node, comparing the data contained within the node, and inserting where appropriate.

Since each copy of the structure is linked into the list by a pointer, its possible to hook, and unhook, these links. We can quickly insert and delete nodes in this list, simply by hooking in or unhooking links in a chain of pointers. The linked-list deletion method (shown below) is vastly more efficient than shuffling down numerous array elements, and more memory effective than logical deletion, as the memory occupied by a deleted node can be made free for other purposes.

Node deletion in a linked-list

Node deletion in a linked-list

This deletion process works by traversing each node in the list, looking for the correct node to delete (ie - the one that contains a particular piece of data). When the right node is found, a copy of the pointer to the next node is made, and the node is then deleted. The previous node's pointer to the next node is set to the copied pointer, and the node is effectively cut out of the linked list. Its memory is freed, making more heap memory available to the program.

With the same ease that we can delete nodes to the list, so too can we dynamically allocate memory and insert new nodes into the list. Rather than reserving some large arbitrary amount of memory (such as a fixed array), we can allocate memory as we need it, to insert each node. We can traverse the list, looking for the end, or maintain a separate pointer to the tail of the list, and insert a node from that point on. The latter gives a slight performance increase, particularly as the length of the list increases.

Node insertion in a linked-list

Node insertion in a linked-list

To allocate memory dynamically, we must use the malloc system call. On Unix systems, you can include the malloc function in your programs by adding the following include statement :

#include <malloc.h>

If you are using a non-standard library, or perhaps compiling under an Intel system, you may need to substitute ' alloc.h ' for ' malloc.h '. Next, in our code, we make a call to malloc, and pass as its only parameter the size of memory we wish to reserve. It will return to us a pointer to this reserved memory location, which we can assign to the next pointer of our linked list structure.

tailnode->next = malloc(sizeof(struct c_linked_list));

The following example shows how a structure can refer to another structure through pointers, the creation of a linked list, the traversal of a linked-list, selective deletion of individual nodes, and finally the process of terminating a linked-list via the free() function. Its most important to traverse the list, and free all nodes, when the program is finished with the list - for without this final step, the memory may not be freed correctly and may leave ' memory leaks '.

NB - in the following example, the malloc call is placed in a separate function for the purposes of clarity


#include <stdio.h>
#include <malloc.h>

/* Linked list structure */
struct c_linked_list
{
        char data[20]; /* String */
        struct c_linked_list *next;
};

/* Type definition for linked list structure */
typedef struct c_linked_list link;

/* Function prototypes */
void set_element(link *element, char *name);
void dump(link *element);
void add(link *element, char *name);
void delete(link *element, char *name);
link *create_element();
void deallocate_list(link *element);

void main()
{
        link *head, *current;
        
        head = create_element();

        /* Enter the first name */
        strcpy (head->data, "David"); 
        head->next = NULL;

        current = head;

        /* Dump current list to screen */
        printf ("\nInitialised list : \n");
        dump(current);

        /* Add names to list */
        add(current, "John");
        add(current, "Jack");
        add(current, "Sue");
        add(current, "Larry");

        /* Dump list to screen */
        printf ("\nList after additions : \n");
        dump(current);

        /* Delete Jack */
        delete(current, "Jack");

        /* Dump list to screen */
        printf ("\nList after deletion of Jack : \n");
        dump(current);

        /* Kill list */
        deallocate_list(current);
        
}

link *create_element()
{
        return (link *) malloc(sizeof(link));
}

void set_element(link *element, char *name)
{
        strcpy (element->data, name);
}

void deallocate_list(link *element)
{
        link *current;

        /* Cycle through to the end of linked list */
        while (element != NULL)
        {
                /* Copy pointer */
                current = element;

                /* Get next pointer */
                element = (link *) element->next;               

                /* Free old pointer */
                free (current); 
        }
}

void dump(link *element)
{
        /* Cycle through to the end of linked list */
        while (element->next != NULL)
        {
                printf ("%s\n", element->data);
                element = (link *) element->next;
        }

        /* Print final entry in list */
        printf ("%s\n", element->data);


}

void add(link *current, char *name)
{
        link *element;

        element = current;

        /* Cycle through to the end of linked list */
        while (element->next != NULL)
        {
                element = (link *) element->next;
        }

        /* Create a new link */
        (link *) element->next = create_element();
        element = (link *) element->next;
        
        /* Set values for element */
        element->next = NULL; /* Empty pointer */
        strcpy(element->data, name);
}

void delete(link *current, char *name)
{
        link *element;

        element = current;

        /* Cycle through to the end of linked list, searching
           for a string that matches name */
        while (element->next != NULL)
        {
                /* If the next element's string matches name */
                if (strcmp (element->next->data, name) == 0)
                {
                     link *temp_pointer;

                     /* Get the element after next */
                     temp_pointer = element->next->next;

                     /* Free memory for the next pointer */
                     free (element->next);

                     /* Set the next element to the element after next */
                     element->next = temp_pointer;

                     /* break out of while loop */
                     break;
                }

                /* Continue checking next element */
                element = (link *) element->next;
        }
}