This year I have tried to make it a habit to track how much time I spend on numerical programming. I’d like to put in 10 hours or so a week, and this lets me know if I am doing that. Anyway, I went back to my records from February, when I was developing the Algebraic Mesher in C. It took me 20.9 hours to complete, with about half of that time spent in a frustrating effort to track down numerous memory problems. The promise of scripting languages is that they let you develop programs much more efficiently, at the price of slower run-time performance. So I decided to create the mesher in Python and see how long it took me.
Method
Four months had elapsed since I finished the C version and I couldn’t recall how I had programmed it. All I remembered was that it was a miserable experience. Thus it seemed that I could create a Python version from scratch without undue influence from the previous effort. To get started, I reviewed the equations linking the ID numbers for vertices, faces and cells from earlier, but nothing more. I didn’t look at the C code or my follow-up blog entry on the experience.
So I opened up the IDLE editor and looked at a perfectly blank page. I wasn’t sure where to start. Then I thought: I need a cell object. So I made a very simple one:
    class cellclass:
        def __init__(self, CID):
            self.CID = CID
All it does is store the ID number of the cell. I made a list of cells and stored some numbers. So far so good. I then made a vertex object, and created a list of vertices:
    verts = []
    for i in range(NumRows+1):
        for j in range(NumCols+1):
            VID = (NumCols+1)*i + j
            x = j*delx
            y = i*dely
            temvert = vertclass(VID, x, y)
            verts.append(temvert)
In this way I added features incrementally, in baby steps, without really any planning. Proceeding along in this fashion I had the mesh generator finished in 1.5 hours. I couldn’t believe how fast it went. The code wasn’t especially pretty – nearly everything is lumped in one routine – but it seemed to run fine.
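For readers who want to run the vertex loop on its own, here is a minimal sketch. The `vertclass` definition, the wrapper function `make_verts`, and the grid values passed to it are assumptions for illustration; the definitions in the actual mesher may differ.

```python
# Hypothetical vertex class; the original definition is not shown in the post.
class vertclass:
    def __init__(self, VID, x, y):
        self.VID = VID  # vertex ID number
        self.x = x
        self.y = y


def make_verts(NumRows, NumCols, delx, dely):
    """Build the vertex list for a grid of NumRows x NumCols cells."""
    verts = []
    for i in range(NumRows + 1):          # one more row of vertices than cells
        for j in range(NumCols + 1):      # one more column of vertices than cells
            VID = (NumCols + 1) * i + j   # row-major numbering
            verts.append(vertclass(VID, j * delx, i * dely))
    return verts


# Illustrative grid: 2 rows by 3 columns of cells.
verts = make_verts(NumRows=2, NumCols=3, delx=0.5, dely=1.0)
print(len(verts))  # (2+1)*(3+1) vertices
```

Because the IDs are assigned in row-major order, `verts[VID]` is the vertex with that ID, so no lookup table is needed.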
Testing
I wanted to make sure I tested the code as thoroughly as I had tested the C version, so at this point I opened up the C code, reviewed the tests there, and replicated them one by one in Python. With the C version, every test revealed new bugs which had to be laboriously tracked down and fixed. With the Python version, the code passed the first four tests the first time. The fifth test found a bug where I was writing multiple copies of vertices to the face objects. I fixed that in a couple of minutes.
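A check that catches a duplicate-vertex bug of this kind can be sketched as follows. The face representation (a `faceclass` holding a list of vertex IDs, assuming a 2D mesh where each face joins two vertices) and the helper name `find_faces_with_duplicates` are hypothetical; they are not the structures or tests from the actual mesher.

```python
# Hypothetical face object: stores the IDs of the vertices on the face.
class faceclass:
    def __init__(self, FID, vert_ids):
        self.FID = FID
        self.vert_ids = list(vert_ids)


def find_faces_with_duplicates(faces):
    """Return the IDs of faces that list the same vertex more than once."""
    bad = []
    for f in faces:
        # A set drops repeats, so a length mismatch means a duplicate.
        if len(set(f.vert_ids)) != len(f.vert_ids):
            bad.append(f.FID)
    return bad


faces = [
    faceclass(0, [0, 1]),        # well-formed face
    faceclass(1, [1, 2, 1, 2]),  # vertices written twice, as in the bug
]
print(find_faces_with_duplicates(faces))
```

An empty result means every face lists each of its vertices exactly once.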
In a little more than 2 hours I had a working mesher. And it was fun to write. The code just seemed to flow out of my fingertips (and I am not especially skilled at Python). Yes, this code probably should be rewritten in a more modular style. And perhaps it could be made more ‘Pythonic’. So maybe another hour or two is needed to really polish it up. But there is something extremely satisfying about getting a rough version working with so little effort.
One major disappointment with the C version of the program was the memory problems. I had spent some time developing a generic list structure in C, so I could make my programming in that language less painful. The list and FOR_EACH macro were supposed to make C more like Python. The thought was good, but the experience was not.
Stats
The table shows the development time and lines of code needed for each version of the code. From Prechelt’s study, we expect a Python program to be about 3.8x faster to develop and 3.6x shorter than its C/C++ equivalent; that is, the speedup in development time roughly mirrors the reduction in lines of code. In my case, the speedup was nearly 9x, while the line count dropped only about 2x. This may be due to the inordinate amount of debugging time the C code required. In any case the general trend holds: Python is a lot faster to develop in.
          Time    ------- Lines of code -------
Lang      (hrs)   Mesher    Test Code    Total
-----------------------------------------------
C         20.9    299       163          462
Python     2.4    124        89          213
-----------------------------------------------
Ratio     8.7x    2.4x      1.8x         2.2x
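The ratios in the last row can be reproduced from the raw numbers with a few lines of Python. The dictionary layout and key names are just a convenient choice for this sketch; the values are copied from the table.

```python
# Development time (hours) and line counts, copied from the table.
c = {"hours": 20.9, "mesher": 299, "test": 163}
py = {"hours": 2.4, "mesher": 124, "test": 89}

# Total lines = mesher lines + test lines.
c["total"] = c["mesher"] + c["test"]
py["total"] = py["mesher"] + py["test"]

# Ratio of C to Python for every column.
ratios = {key: c[key] / py[key] for key in c}

for key, ratio in ratios.items():
    print(f"{key:7s} {ratio:.1f}x")
```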
Run Times
The big downside of fast development time is slow execution time. Python does all kinds of nice things for you (like memory management), but that comes at a price. To measure this, I had both the C and Python meshers create a million cell mesh and timed them. Here are the results:
          Time    RAM
Lang      (sec)   (GB)
-----------------------
C         6.9     0.366
Python    344     1.33
-----------------------
Ratio     50x     3.6x
This test was run on an Intel Core2 Duo (T7300) at 2 GHz, with 2 GB of RAM. I think much of the time, for both programs, went into memory allocation. This was corroborated when I tried compiling the C version with various optimizations and got no change in the timing. There may be some ways to allocate memory in bigger chunks, and thus speed up the code.
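The post does not say how the time and memory figures were collected. One way to take similar measurements for a Python function, using only the standard library, is sketched below. `build_mesh` is a hypothetical stand-in for the real mesher, and `measure` is a helper invented for this example. Note that `tracemalloc` reports only memory allocated through Python and slows the run down, so its peak figure is not directly comparable to the RAM numbers in the table.

```python
import time
import tracemalloc


def build_mesh(num_rows, num_cols):
    # Stand-in for the real mesher: one small tuple (ID, row, col) per cell.
    return [(r * num_cols + c, r, c)
            for r in range(num_rows) for c in range(num_cols)]


def measure(func, *args):
    """Return (result, elapsed seconds, peak traced bytes) for func(*args)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, elapsed, peak


# A small grid keeps the demo quick; a million cells would be 1000 x 1000.
mesh, seconds, peak_bytes = measure(build_mesh, 100, 100)
print(f"{len(mesh)} cells, {seconds:.3f} s, peak {peak_bytes / 1e6:.2f} MB")
```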
There are tools for speeding up the execution of Python programs. Tools like Psyco and Pyrex can offer dramatic improvements, at least under some conditions. That might be a good experiment for next time.
File: PyAlgMesher.py