.. _arrays: Arrays ====== - An *array* is a sequence of data items that are of the same type and stored contiguously in memory. - In reality (resulting assembly language code), there is no such thing as an array. The array in C is just a convenient notation that simplifies two tasks. #. Arrays make it easy to allocate memory. When an array is declared, memory to store the array is allocated or reserved, otherwise memory allocation routines (`calloc` and `malloc`) must always be used. #. Arrays make it easy to reference individual data points without having to use pointer arithmetic. - Arrays can be one dimensional (single sequence of data), two, three or `N` dimensional. Regardless of the size and dimension of the data, it is all stored sequentially. - The index of the array always goes from `0` to `N - 1`, where the size of the array is `N`. Declaring arrays ---------------- - Square brackets, [size], are used to initialize an array. The size of the array is an integer constant, either a number or a #define constant. - A variable or expression may be used as an array index when referencing an array, but not when declaring it. :: char str[50]; /* string */ double mydata[5][10]; /* 5 x 10 (2 dimensions) */ char *list[30]; /* an array of pointers, most likely to strings, another type of 2-dimensional data */ - Global and static (put in bss) arrays are initialized to zero by default. - Automatic (on the stack) arrays are not initialized by default. It is best to never assume an initial value of an array. - To initialize an array, put the values in brackets `{}`. All but the first dimension of a multi-dimension array must be specified. :: int a[3] = {3,4,5}; int a[] = {3,4,5}; /* size determined by the initial brackets */ int b[][3] = {{7,8,9},{4,5,6},{3,4,5}}; /* a 3x3 array */ An Array Example ---------------- :: #include <stdio.h> #define N 5 int main(void) { int a[N], i, sum=0; for( i=0; i < N; i++ ) { a[i] = 7 * i * i; sum += a[i]; printf( "a[%d] = %d\t", i, a[i] ); } return 0; } Pointers and Arrays ------------------- - If we have an array (int a[3]) and sizeof(int) = 4, then: - ``a = &a[0]`` is a `constant` pointer to the starting address of the array. - ``&a[1] == &a[0] + 4``, ``&a[2] == &a[1] + 4``, because size of int is 4. - ``a[i]`` is the same as ``*(a + i)``. - When we declare a pointer variable, we have a place to store the pointer variable, but the pointer does not point to any data yet. When we declare an array, the opposite is true. We have allocated a place in memory to store the values of the array, but the name of the array is a *constant* -- not a variable with its own storage location. The pointer value associated with the identifier of the array is kept in the text part of memory. So we can do ``*(a + 1)``, but not ``*a++``. - When pointer arithmetic is performed, the `sizeof` operation is implicit. If `a[]` is of type int, then `\*(a+i)` can be thought to mean ``*(a + sizeof(int)*i)``. We can use pointer arithmetic to step through the array in a loop. :: #define SIZE 50; #define END_ADDR (array + SIZE) double array[50]; /* a global array */ int main(void) { double *q; int i; for( i = 0, q = array; q < END_ADDR; q++, i++ ) { /* * q points to an element in array, the loop increments q * through all the elements in the array. */ *q = (double)i; ... } . . . } Passing Arrays to Functions --------------------------- For a function to operate on an array, we pass a constant pointer to the array as an argument to the function. :: int sum( int *, int ); int main(void) { int a[N], sum; ... sum = sum( a, N ); ... } int sum( int a[], int size ) /* could also use 'int *a' */ { int i, s=0; for( i = 0; i < size; i++ ) s += a[i]; return s; } .. _multiD_arrays: Multi-dimensional arrays ======================== - Multi-dimensional arrays are declared like one dimensional arrays, except with a set of brackets and size for each dimension of the array. - The array index closest to the identifier has the largest data type. - The array index furthest to the right of the identifier has the smallest data type. - For an N dimensional array, using pointer arithmetic, we must dereference the identifier for the array (N - 1) times to get a pointer to an individual data element. We must dereference the identifier N times to access the contents of an individual data element. A 2 dimensional array ---------------------- If we declare a two dimensional array, such as `int a[3][5];`, it can be thought of as a matrix of rows and columns. The last dimension (`[5]` represented the columns. The columns of the first row have the lowest memory address regardless if the array is on the stack, or in the data or bss memory sections. :: In Memory: |-----------| |-----------------------------------------| | a[2][4] | | a[0][0] a[0][1] a[0][2] a[0][3] a[0][4] | | . | | a[1][0] a[1][1] a[1][2] a[1][3] a[1][4] | | . | | a[2][0] a[2][1] a[2][2] a[2][3] a[2][4] | | . | |-----------------------------------------| | a[0][2] | | a[0][1] | | a[0][0] | |-----------| - Just like with one dimensional arrays, we can use pointer arithmetic to access elements of the array. The key to understanding pointer arithmetic with a multi-dimensional array is to first understand how to *describe the data type* of the identifier bearing the name of the array. - `a` is of type pointer to [5] of int. -- points to a whole column. - `\*a` is of type pointer to int. :: *(a + 1) = a[1][0] *(a[0] + 1) = *(*a + 1) = a[0][1] Notice multiple dereference operators (\*) above. The first one has the usual meaning of "contents of". The rest of the dereference operators have a slightly different meaning. They mean "remove one pointer" in the description of the data type. :: Where is a[2][3] in memory? *a+3 | a is ptr to [5] of int. |-------------------| *a is ptr to int. a = X ->| | | | | | |-------------------| | | | | | | |-------------------| a+2 ->| | | | * | | |-------------------| | a[2][3] == *(*(a+2)+3) a[2][3] If a points to address X, then the address of a[2][3] is X + 5*sizeof(int)*2 + sizeof(int)*3 = X + 5*4*2 + 4*3 = X + 52 For an array of size [3][5], some expressions which are equivalent to `a[i][j]` are: - \*(a[i] + j) - \*(a + i)[j] - \*(\*(a + i) + j) - \*(&a[0][0] + 5\*i\*sizeof(type) + j\*sizeof(type)) Compile the following program and make sure you understand what is printed. :: #include <stdio.h> int main(void) { int *p, r[2][3]; double *q; p = (int *)1000; q = (double *)1000; printf("%d\t%d\t%d\n", p+1, q+1, ((int)(r+1) - (int)r)); return 0; } /* printed output is: 1004, 1008, 12 */ A 3 dimensional array ------------------------ :: int A[4][3][5]; What is the pointer arithmetic notation for A[2][2][3]? A is type ptr to [3][5] of int. ==> A+2 points to the needed 2-dim array. *A is type ptr to [5] of int. ==> *(A+2)+2 points to the needed row. **A is type ptr to int. ==> *(*(A+2)+2)+3 points to the needed variable. A[2][2][3] == *(*(*(A+2)+2)+3) If the address pointed by A is X, then since sizeof(int) = 4: &A[2][2][3] = X + 3*5*4*2 + 5*4*2 + 4*3 = X + 172