6.2. Arrays

  • An array is a sequence of data items that are of the same type and stored contiguously in memory.
  • In reality (resulting assembly language code), there is no such thing as an array. The array in C is just a convenient notation that simplifies two tasks.
    1. Arrays make it easy to allocate memory. When an array is declared, memory to store the array is allocated or reserved, otherwise memory allocation routines (calloc and malloc) must always be used.
    2. Arrays make it easy to reference individual data points without having to use pointer arithmetic.
  • Arrays can be one dimensional (single sequence of data), two, three or N dimensional. Regardless of the size and dimension of the data, it is all stored sequentially.
  • The index of the array always goes from 0 to N - 1, where the size of the array is N.

6.2.1. Declaring arrays

  • Square brackets, [size], are used to initialize an array. The size of the array is an integer constant, either a number or a #define constant.

  • A variable or expression may be used as an array index when referencing an array, but not when declaring it.

    char str[50];                 /* string */
    double mydata[5][10];         /* 5 x 10 (2 dimensions) */
    char *list[30];       /* an array of pointers, most likely to strings,
                             another type of 2-dimensional data */
    
  • Global and static (put in bss) arrays are initialized to zero by default.

  • Automatic (on the stack) arrays are not initialized by default. It is best to never assume an initial value of an array.

  • To initialize an array, put the values in brackets {}. All but the first dimension of a multi-dimension array must be specified.

    int a[3] = {3,4,5};
    int a[] = {3,4,5};      /* size determined by the initial brackets */
    int b[][3] = {{7,8,9},{4,5,6},{3,4,5}}; /* a 3x3 array */
    

6.2.2. An Array Example

#include <stdio.h>
#define  N 5

int main(void)
{
   int a[N], i, sum=0;

   for( i=0; i < N; i++ ) {
      a[i] = 7 * i * i;
      sum += a[i];
      printf( "a[%d] = %d\t", i, a[i] );
  }
  return 0;
}

6.2.3. Pointers and Arrays

  • If we have an array (int a[3]) and sizeof(int) = 4, then:

    • a = &a[0] is a constant pointer to the starting address of the array.
    • &a[1] == &a[0] + 4, &a[2] == &a[1] + 4, because size of int is 4.
    • a[i] is the same as *(a + i).
  • When we declare a pointer variable, we have a place to store the pointer variable, but the pointer does not point to any data yet. When we declare an array, the opposite is true. We have allocated a place in memory to store the values of the array, but the name of the array is a constant – not a variable with its own storage location. The pointer value associated with the identifier of the array is kept in the text part of memory. So we can do *(a + 1), but not *a++.

  • When pointer arithmetic is performed, the sizeof operation is implicit. If a[] is of type int, then *(a+i) can be thought to mean *(a + sizeof(int)*i). We can use pointer arithmetic to step through the array in a loop.

    #define SIZE       50;
    #define END_ADDR   (array + SIZE)
    
    double array[50];   /* a global array */
    
    int main(void)
    {
       double *q;
       int i;
    
       for( i = 0, q = array; q < END_ADDR; q++, i++ ) {
          /*
           * q points to an element in array, the loop increments q
           *  through all the elements in the array.
           */
          *q = (double)i;
          ...
       }
       .
       .
       .
    }
    

6.2.4. Passing Arrays to Functions

For a function to operate on an array, we pass a constant pointer to the array as an argument to the function.

int sum( int *, int );

int main(void)
{
   int a[N], sum;

   ...
   sum = sum( a, N );
   ...
}

int sum( int a[], int size ) /* could also use 'int *a' */
{
   int i, s=0;

   for( i = 0; i < size; i++ )
      s += a[i];
   return s;
}

6.3. Multi-dimensional arrays

  • Multi-dimensional arrays are declared like one dimensional arrays, except with a set of brackets and size for each dimension of the array.
  • The array index closest to the identifier has the largest data type.
  • The array index furthest to the right of the identifier has the smallest data type.
  • For an N dimensional array, using pointer arithmetic, we must dereference the identifier for the array (N - 1) times to get a pointer to an individual data element. We must dereference the identifier N times to access the contents of an individual data element.

6.3.1. A 2 dimensional array

If we declare a two dimensional array, such as int a[3][5];, it can be thought of as a matrix of rows and columns. The last dimension ([5] represented the columns. The columns of the first row have the lowest memory address regardless if the array is on the stack, or in the data or bss memory sections.

In Memory:
|-----------|       |-----------------------------------------|
|  a[2][4]  |       | a[0][0] a[0][1] a[0][2] a[0][3] a[0][4] |
|     .     |       | a[1][0] a[1][1] a[1][2] a[1][3] a[1][4] |
|     .     |       | a[2][0] a[2][1] a[2][2] a[2][3] a[2][4] |
|     .     |       |-----------------------------------------|
|  a[0][2]  |
|  a[0][1]  |
|  a[0][0]  |
|-----------|
  • Just like with one dimensional arrays, we can use pointer arithmetic to access elements of the array. The key to understanding pointer arithmetic with a multi-dimensional array is to first understand how to describe the data type of the identifier bearing the name of the array.

  • a is of type pointer to [5] of int. – points to a whole column.

  • *a is of type pointer to int.

    *(a + 1) = a[1][0]
    *(a[0] + 1) = *(*a + 1) = a[0][1]
    

Notice multiple dereference operators (*) above. The first one has the usual meaning of “contents of”. The rest of the dereference operators have a slightly different meaning. They mean “remove one pointer” in the description of the data type.

Where is a[2][3] in memory?                             *a+3
                                                          |
a is ptr to [5] of int.                     |-------------------|
*a is ptr to int.                   a = X ->|   |   |   |   |   |
                                            |-------------------|
                                            |   |   |   |   |   |
                                            |-------------------|
                                      a+2 ->|   |   |   | * |   |
                                            |-------------------|
                                                          |
 a[2][3] == *(*(a+2)+3)                                a[2][3]

 If a points to address X, then the address of a[2][3] is
 X + 5*sizeof(int)*2 + sizeof(int)*3 = X + 5*4*2 + 4*3 = X + 52

For an array of size [3][5], some expressions which are equivalent to a[i][j] are:

  • *(a[i] + j)
  • *(a + i)[j]
  • *(*(a + i) + j)
  • *(&a[0][0] + 5*i*sizeof(type) + j*sizeof(type))

Compile the following program and make sure you understand what is printed.

#include <stdio.h>
int main(void)
{
   int *p, r[2][3];
   double *q;


   p = (int *)1000;
   q = (double *)1000;

   printf("%d\t%d\t%d\n", p+1, q+1, ((int)(r+1) - (int)r));
   return 0;
}           /* printed output is: 1004, 1008, 12 */

6.3.2. A 3 dimensional array

int A[4][3][5];

What is the pointer arithmetic notation for A[2][2][3]?
A is type ptr to [3][5] of int. ==> A+2 points to the needed 2-dim array.
*A is type ptr to [5] of int.   ==> *(A+2)+2 points to the needed row.
**A is type ptr to int.  ==> *(*(A+2)+2)+3 points to the needed variable.
A[2][2][3] == *(*(*(A+2)+2)+3)

If the address pointed by A is X, then since sizeof(int) = 4:
&A[2][2][3] = X + 3*5*4*2 + 5*4*2 + 4*3 = X + 172