Archive for January, 2008

Half Edge Mesh – Intro

Saturday, January 26th, 2008

When I created the mesh generator for BFlow, I used a straightforward but unimaginative set of data structures for storage. A mesh is made up of cells, faces, edges and vertices. Unstructured meshes have arbitrary relationships between these (unlike a structured mesh), and thus those relationships must be stored.

In a 2d, a mesh looks like this:

mesh

My BFlow data structures to hold the information for a mesh look like this:

TYPE CellType node(4) AS LONG '1=SW,2=SE,3=NE,4=NW (1st 4 are the corners) Face(4) AS LONG '1=East, 2=west, 3=North, 4=South nb(4) AS LONG 'Neighbor cells. 1=East, 2=west, 3=North, 4=South END TYPE TYPE FaceType node(2) AS LONG 'ptrs to two nodes of face cell(2) AS LONG 'ptrs to neighbor cells END TYPE TYPE VertType face(4) AS LONG 'pointer to touching faces, 1=East,2-West,3=North,4=South cell(4) AS LONG 'pointer to touching cells, 1=SW,2=SE,3=NE,4=NW END TYPE

(These are Basic declarations, but they would look very similar in C.) You can see that everything is stored. A cell can have four faces and four nodes. Pointers to them are stored in that cell structure. A vertex can be touched by four faces. Pointers to all those faces are stored for each vertex.

There is a very elegant alternative storage scheme called the Half Edge Mesh. Amazingly little information is stored, but it provides everything you need! The key idea is that rather than store edges (or faces in 2d), you store half edges (half faces). While an ordinary face is defined by its two end points, a half face is directed and defined by just a single point. The picture shows the idea.

mesh2

This probably looks like a step in the wrong direction, if we are trying to simplify things. Yet, this change is really helpful. The data structure for a half face looks like:

struct face_type { VERT_TYPE *vert; //vertex at the end of the half-face FACE_TYPE *pair; //Oppositely oriented adjacent half-face CELL_TYPE *cell; //cell that this face belongs to FACE_TYPE *nextF; //next half-face around the cell };

The vertex and cell structures are given below:

struct vert_type { double x; double y; FACE_TYPE *face; //Ptr to a touching face }; struct cell_type { FACE_TYPE *face; //south face of cell };

The cell structure seems especially spartan, at least compared with what I started with. How do you access the faces of the cell? Or its vertices? It turns out to be very easy:

FACE_TYPE* face = cell->face; do { // ...some face calcs... face = face->nextF; } while (face != cell->face);

The cell faces are effectively organized into a circular linked list. Thus the above code works equally well for a quad mesh as a polyhedral mesh.

Accessing all of the vertices around the cell is nearly as easy:

VERT_TYPE *vert; FACE_TYPE *face = cell->face; do { vert = face->vert; // ...some vert calcs... face = face->next; } while (face != cell->face);

Now lets try something trickier. Say you needed to loop over all of the cells surrounding a vertex? This could come up in a CFD model, for instance, when interpolating a field value from the cell centers to the vertices. Here is the code:

CELL_TYPE *cell; FACE_TYPE* face = vert->face; do { cell = face->cell; // ...cell calcs... face = face->nextF->pair; } while (face != vert->face);

This takes advantage of the fact that pairs of faces are oriented in opposite directions. In the figure below, face F1 is pointed upward. Its pair face, F2, is pointed downward. Each face stores a pointer to the vertex that it is pointing to. F1 stores a pointer to V1, while F2 stores a pointer to V2.

Mesh3

You can trace out logic by picking one of the faces pointing toward vertex V2, then following the steps. It is very elegant.

Compared to my original scheme, the Half Edge approach is both more efficient (in terms of memory use) and more flexible (since its not limited to quad meshes). In my next entry, I will explore this in more detail.

Naming Conventions in C

Saturday, January 12th, 2008

Having done much of my programming in Basic, I have been sweetly oblivious to the importance of case in naming variables. In Basic, NumNodes is the same as numnodes, numNodes and NUMNODES. As a result, you can type any combination that suits you and all is well. Your eye doesn’t even notice the differences after awhile. That all changes when you program in C. I am translating some meshing code from PowerBasic to C for my next blog entry and the gcc compiler is giving me tons of errors like:

HalfMesher.c: In function `createMesh': HalfMesher.c:242: error: `OldNF' undeclared (first use in this function) HalfMesher.c:254: error: structure has no member named `Faces'

And that is after going through the code line by line looking for such things!

CompoundNames

This got me thinking about naming conventions for variable names and functions. There is an amazing amount written on the topic and feverous religious battles have been fought. Still, a few patterns emerged. In the C world, there is a tendency to use all lower case for simple variable and function names. For compound names, like CellVolume, there are several options:

  • Run together: cellvolume
  • Underscores: cell_volume
  • Lower mixed case: cellVolume
  • Upper mixed case: CellVolume

Mixed case is one of several names for capitalizing the first letter of each word. (Others are camel case, medial capitals, InterCaps.) In lower mixed case, this is applied only to the words after the first. In the list above, the first two combining methods seem the most common for C programmers, followed by lower and finally upper mixed case.

I have to say that I don’t care for the run together approach because it can be ambiguous for some word combinations. For instance consider the variable cellslip. It could be interpreted as cell_slip or cells_lip. As you read the code, you need to stop and decipher this variable name. That may only take a moment, but it is an extra burden on the reader. The underscore method addresses this issue nicely, but adds to the bulk of variables, and can clutter the code. Mixed case gives you the benefits of word separation, without the extra bulk. Because C programmers have tended to prefer lower case, lower mixed case seems like the best approach for compound names.

Other Issues

It turns out there are many more issues that come up with name conventions. For instance, terseness. The C examples in Kernighan and Ritchie are filled with one and two letter variable names. Their code is compact, but it can be hard to read. At the other extreme, there is the natural language approach of Kari Laitinen. Here is a snippet of code taken from his site:

cout << "\n Please, type in two integers separated " << "with spaces: " ; cin >> first_integer_from_keyboard >> second_integer_from_keyboard ; sum_of_two_integers = first_integer_from_keyboard + second_integer_from_keyboard ;

It is very readable, but that is partly because it is also very simple. If you are solving complex equations, with lots of variables in them, your code will be highly cluttered if you follow this approach. Which is easier to read:

y = a*x*x + b*x + c;

or

Answer_to_problem_7 = First_Coefficient*x_variable*x_variable + Second_Coefficient*x_variable + Third_Coefficient;

The example above introduces the importance of idioms. Even very terse names can convey meaning if they follow a common idiom. Any mathematically aware persion will recognize the above equation as that of a second order polynomial. For this case single letter names are fine. Similarly, using i and j as index names for loops is a common idiom in programming, and so is fine too.

What is the optimum length for variable names? Steve McConnell discusses this in Code Complete. He references a 1990 study by Gorla, Benander and Benander that suggests names in the range of 10 to 16 characters where the easiest to debug.

Another important point is that care is required with global variables. Local variables can be understood in the context of the function they reside in. Global variables are generally without context. As you read code, they pop up randomly, and you need to understand them. Thus, their names must be descriptive enough to standalone, within reason.

Naming Convention

Here are the rules I have come up with for myself when programming in C. Following these should both make my life easier, and hopefully make my code easier to read.

  • Single word variable names are lower case: face
  • Compound variable names use lower mixed case: theLongName
  • Functions follow same convention as variables
  • Macros are upper case: MAX
  • Typedefs are upper case and end in _TYPE for complex types: FACE_TYPE
  • Names for lists or collections typically end in ’s’: cells
  • Global variables start with g and are more detailed: gDiscretizationScheme
  • If it helps readability, pointers can be prefixed with p: pFace
  • Abbreviations are to be minimized.
  • Idioms are used to shorten names where it makes sense

Writing to generic arrays

Tuesday, January 1st, 2008

In an earlier entry, we developed a generic array in C and a way to loop through that array. The idea behind this is simple: if we have a standard way to working with arrays, we can eliminate common errors (like writing outside array bounds) while providing great flexibility. In playing with this tool, I found that I need a function for writing an element to an array. I quickly whipped something out and hit a roadblock.

Imagine a function, ListAppend, that adds a new element onto the end of a generic array. Its declaration might look like this:

int ListAppend(AList *list, void *item)

(The name ListAppend might seem odd in the context of C arrays, but I am picturing a python list, that you can easily manipulate without worries.) This function takes the list, and a pointer to the item to be added and returns the array index of this new item. The problem comes when we try to copy the data in item to its slot in the array. Imagine some code like this

void *pData = list->Data + list->NumEl * list->ElSize; //calc where to write data *pData = *item; // Write data <-- illegal! Can't deref void ptr.

The first line determines where to write the data (at the end of the array) and the second line writes it. Unfortunately, the compiler doesn’t like this. Because both pointers are of type void, the compiler doesn’t know how many bytes to copy from *item and doesn’t know how many *pData can safely hold. For this to work right, both pointers need to be of the correct type. The compiler needs this information at compile time, so we can’t be setting it at run time. What to do?

Solutions

I have thought of three ways to approach this, none totally satisfactory.

- User defined assignment function
- memcopy
- macros

In the first method, we have a single ListAppend function of the form:

int ListAppend(AList *list, void *item)

like before. But we add a function pointer to the list structure. This pointer points to a user written function that copies the data from item to the correct slot at the end of the array. Here is an example of such a function:

int AssignFuncVert(void *src, void *dest) { VERT_TYPE *pdest = (VERT_TYPE *)dest; VERT_TYPE *psrc = (VERT_TYPE *)src; *pdest = *psrc; return 0; }

This code first casts the pointers into the appropriate type (VERT_TYPE in this case) and then does the assignment. This approach is in the opposite direction of where we are trying to go with these generic arrays. We want our generic structures to be easy to use. Function pointers have complex syntax and are scary for some people.

The memcpy approach takes advantage of the memcpy function in the string.h file. This function copies n bytes from one location to the other. You don’t have to worry about types. As long as you copy the correct number of bytes, all is well. Since we are already storing the size of an element, this is no problem. The memcpy function is called like so:

memcpy(pData, item, list->ElSize)

This approach requires no additional work on the part of the user, which makes it very attractive. On the downside, we are circumventing the type checking the compiler does, which may set us up for problems.

In the third approach, we mimic templates in C++. A macro is used to generate a ListAppend function for each data type of interest. You would call the macro like so:

LIST_APPEND(ListAppendVert, VERT_TYPE)

and get a function that looks like:

int ListAppendVert(ALIST *list, VERT_TYPE *item)

The nice thing about this approach is it is type safe. If you pass the wrong thing to the ListAppendVert function, the compiler will complain. The downside is it is a bit unclear, just from looking at an include file containing this macro, what you are supposed to do. There would have to be some good comments included.

Pros and Cons

All three approaches have pros and cons. In my view, the most important criteria in deciding which approach is best are simplicity and robustness. We all have limited bandwidth, and we don’t want to waste it on stupid stuff. A generic array is nice, but if it takes too long to come up to speed each time we use it, we will just use ordinary arrays. Robustness is more subtle. Will the code stand up to typos and other errors, either by flagging them early or gracefully doing the right thing? Not being a strong C programmer, it is hard for me to anticipate what might go wrong.

Approach Simplicity Robustness --------------------------------------------------------------------------- -------------------------------------------- User Func low med memcpy high med Macros med high --------------------------------------------------------------------------- -------------------------------------------

The table is a stab at rating the three approaches. The memcpy approach doesn’t require any extra work by the user, so it ranks well simplicity. I think it should be safe, but can’t be positive. Copying bytes, regardless of type seems a bit risky. The user function approach could be a turn off to people. Function pointers are tricky enough to scare many off. The macro approach is very attractive from a robustness point of view, but I think there is a high potential for forgetting how to use them. Given all this, I think I will go with memcpy and see how it works out.

A small program that implements each of these three methods is in the file WriteToArray. It can be compiled with the command gcc -Wall WriteToArray.c.