In numerical computations, we frequently have lists of objects we must deal with. In C, there are two types of containers commonly used for holding these lists: arrays and linked lists. Arrays are good for fast, random access to the list. However, they are a few drawbacks. It is easy to write outside of array bounds and crash the code. It is hard to add elements to the middle of the list. And if you need more space than you allocated, you need to allocate a new larger block of memory and copy everything over.
Linked lists are good for frequently changing lists. You can easily insert an element anywhere in the list. Since memory is allocated as each element is created, there is no problem running out of room. However, cycling through the list can be slower, since the elements can be scattered all over in memory, and it is time consuming to retrieve the ith element of a list. Numerical folks tend to favor arrays, but both tools are useful.
Here we will create a general array and a macro for looping through it. This will hopefully be useful for reducing coding errors.
Array Structure
Lets begin by creating a general array structure, and some functions to access it.
A generic structure, that can hold arrays of anything might look like:
typedef struct { long NumEl; //number of elements size_t ElSize; //size of each element void *Data; //ptr to data } AList;
This structure stores the size of the array, the size of each item in the array and a pointer to the first element in the array. Knowing the number of elements in the array is useful to staying in array bounds. The size is important for allocating memory and when we try looping. Notice that Data is a pointer to void. That is because we don’t know what type of array we will have. It could be intergers, floats or some arbitrary structure.
A new ArrayList could be created like so:
void CreateList(AList *list, long NumEl, size_t ElSize) { list->NumEl = NumEl; list->ElSize = ElSize; list->Data = (void *)malloc((NumEl + 1) * ElSize); if (list->Data == NULL) printf ("Error in CreateList: Can't allocate memory"); }
To destroy a list, we must deallocate the memory for the data array. The DestroyList function does this:
void DestroyList(AList *list) { if (list->Data != NULL) { free(list->Data); list->Data = NULL; } }
One can access an element in the array with
void *GetItem(AList list, long index) { if (index >= 0 || index < list.NumEl) {return list.Data + index * list.ElSize;} else { printf ("Wrting outside array bounds"); return NULL; } }
This may look a little strange. If Data is a pointer to an array why not access the elements in the normal fashion with [] brackets? The problem is that we don’t what ahead of time what type of data is stored in the array. For something like list.Data[5] to work, the compiler must know the size of each element, so it can calculate the distance (in memory) from the starting point. Since it doesn’t know that, we do the calculation for it.
One could also envision a SetItem routine, but this would be very dependent on the nature of the element being stored. So it is best to let the user set each element as it makes sense to do so.
For Each loop
One of the first lessons I learned in programming was: Don’t write outside array bounds! Its an important lesson, and a simple one, yet I still occasionally mess up. Why? Because its easy. Here is a very simple way to screw up in C:
int i; float myArray[10] for i=0; i<=10; i++) { //some code }
In C, arrays start at index 0, so the above declaration runs from myArray[0] to myArray[9]. Yet our loop runs to myArray[10]. Trouble!
Some languages handle the looping details for you. Consider this equivalent code in Python:
myList = [1,4,2,3,5,6,3,8,5,9] for each item in myList: #some code
The For statement knows there are ten items in myList and loops over them. Each pass through the loop, it takes the current item in the list and assigns it to Item. You can then do what you want with it.
This type of high level construct frees us to focus on the job at hand while leaving the bookkeeping to the computer. Unfortunately, in the case of Python, this comes at the cost of speed.
Lets try to emulate the For statement in Python. Consider the following macro:
#define FOR_EACH(index, item_ptr, list) \ item_ptr=list.Data; \ for (index=0; indexThe FOR_EACH macro has three arguments: an index, a pointer to the an item in the array and the list structure. Like the Python equivalent, with each pass through the loop, the item pointer is set to point to the current position in the array. To find that position, we need to know the size of the elements, as mentioned above. With a macro like this, it is impossible to write outside array bounds. Because the compiler substitutes the code in at compile time, the performance should be the same as hand coding the loop. The best of both worlds!
Trying it out
Lets try a simple example to make sure it is working. Here is a very simple structure:
typedef struct //toy structure to play with { int val1; double val2; } ElType;To loop through a list, we first create the list, then call the FOR_EACH macro. We must declare an index variable and a pointer to an element, but don’t need to initialize these. Here is the code:
int main() { int i; AList mylist; ElType myEl; ElType *pMyEl; CreateList(&mylist, 100, sizeof(myEl)); FOR_EACH(i, pMyEl, mylist) { pMyEl->val1 = i; pMyEl->val2 = (double)(2.*i); printf("i = %d, val1 = %d, val2 = %f\n", i, pMyEl->val1, pMyEl->val2); } DestroyList(&mylist); return 0; }A nice feature of this approach is one could later change the array to be 200 elements and all the loops will still work fine.
Download code here: Array Code
[...] Basic Numerics Blog Experiments in Numerical Methods « A Generalized Array in C [...]
[...] In an earlier entry, we developed a generic array in C and a way to loop through that array. The idea behind this is simple: if we have a standard way to working with arrays, we can eliminate common errors (like writing outside array bounds) while providing great flexibility. In playing with this tool, I found that I need a function for writing an element to an array. I quickly whipped something out and hit a roadblock. [...]