I was reading the article “Destructors Considered Harmful” on DrDobbs by Andrew Koenig. Generally I agree with Andrew’s views on various programming topics but I thought I should just write down my thoughts on this particular article. I have a number of agreements and disagreements on certain points. One point that specifically sticks out especially with the new generation of dumb-down languages is that it looses the traditional power of kind of raw functionality that you can churn out from a piece of code. Of course we can all be happy with a walled garden approach if we are only implementing some business logic which does not need to actually utilize the full power of the underlying HW but instead rely on language/OS supported runtimes to do the required optimizations for them. But what happens when we start arguing against a programming concept/construct just because it requires some basic programming discipline and understanding of basic principles before one actually needs to start using it?
One such typical argumentation occurs with GoTo statements. And as mentioned in Dijkstra’s paper as well as that article, GoTo’s definitely create a branching pattern which will be hard to follow, and now comes the important part “If they are used haphazardly and just as a means of executing particular lines of code without taking into consideration what the state of the program might be!”. A structured language definitely has no place for GoTo. But what about assembly language? Just rejecting GoTo as a general programming construct is not a sensible way going forward. Instead, the programming discipline should be taught on how to understand the program state and use proper branching constructs. GoTo’s are essentially a boon when I am writing boot loaders or doing POST routine implementation. A failed device initialization or return code will not stop the POST from executing.
The article goes forward stating that GoTo affects our ability to think about programs! I really don’t agree there. Before I compile program or even before I add a variable, I follow certain strict rules. I will try to jot them down below which falls into the discipline category. If people follow that, everything will fall in order more or less.
Checklist for declaring/using variables:
– Why is the variable needed?
– What kind of variable is needed?
– Does it have to be on the stack or heap?
– What is the lifetime of the variable?
– What is the typical memory allocated on a particular architecture for that variable?
– Will that variable need more memory than it would be allocated?
– Is it necessary to consider endian-ness while evaluating the variable?
– What are the bounds for the data being help by variable?
– Does the variable need validation before evaluation?
– Is it an alias for holding other variables or memory addresses (in other words, is it a pointer)?
Once you answer the above questions, one very well understands why the variable is needed and how it should be used. Why am I talking about variables now? Because branching very much depends on the state of a variable or memory location (register, etc.). The author in the article also talks against pointers. Yes, pointers create problems when you don’t follow certain rules. I will try to put down my points using simple statements with the below C examples.
Prog1: | Prog2: |
#define CONST_SIZE 10
…
int n=5;
char ptr[CONST_SIZE];
…
while(n>0) {
… // Do some logic implementation here // …
n-=1;
}
…
|
#define CONST_SIZE 10
…
int n=5;
char *ptr = (char *) malloc(sizeof(char)*CONST_SIZE);
…
while(n>0) {
memset(ptr, 0, CONST_SIZE);
…// Do some logic implementation here // …
n-=1;
}
free(ptr);
ptr=null;
|
Most of the newbies that I see would use the programming construct on the left and not on the right. And believe me, even professionals do it simply because we don’t follow certain programming guidelines. Prog1 has advantages but is limiting in terms of scalability. Since the memory is allocated on stack, your code will simply be a bottleneck if you increase the CONST_SIZE. On the other hand, if this particular function is called multiple times or is recursive, it will lead to stack overflow and even end-up fragmenting the stack. Prog2 is much more dynamic and scalable. Increase in CONST_SIZE will not be an issue and it can even be passed as a dynamic parameter to a function in future versions if needed. The best part is that we reduce the stack usage and re-use the same memory locations during the whole run of the function.
Of course if some programmer ends up messing with ptr without storing the initial location in another variable, he will end up with the problems that Andrew is pointing to especially “pointer exceptions pointing to invalid memory locations”. But instead of saying pointers are bad, why not say that pointer manipulations are bad without storing the initial memory address location. All the reasons he has mentioned in the article happens just because a programmer is inefficient in his use of pointers.
The rest of the article than goes on comparing pointers to labels in GoTos. That I would say is absolute BS. First of all, labels in GoTo statements are “labels” or entry points to code fragments. i.e. they are addresses. Secondly, pointers are variables pointing to memory locations. i.e. they are not entry points to code fragments. Andrew philosophically tries to confuse basic constructs to prove his case.
A look at page 2 in the article, one immediately notices the fallacy of his arguments. Now he moves into C++ domains but let me try to prove why I disagree with him. Firstly he is not following the rules for using pointers. i.e. pointer validation before assignment is an important aspects. Most of the bugs arise because of no/improper pointer validation. Ownership of variable is an important aspect. Even in C++, the scope, extent and evaluation of the variable needs to be thought of. One important rule to remember is, “If your code is not owning the memory i.e. your variable has only been assigned the value and has not been passed the ownership, then you don’t own the memory!“
The code that he has written will not even pass my basic code review test. In the constructor for ThingHandle, he is simply assigning the variable to the pointer tp. i.e. he is simply assigning the value but not passing the ownership. And in the destructor, he is assuming ownership of memory location and trying to free it. Of course it would fail but not because pointers are bad in nature or destructor are stupid, it is only because some guy didn’t follow the proper coding discipline. He writes that he is encapsulating “memory allocation” but in the code fragment he shows, he is not encapsulating “memory allocation”!
All the other statements after that falls down as a pack of cards. The errors he points out only happens because somebody has messed up with the concept of ownership and data encapsulation. And then he goes on to state the analogy behind GoTo and destructors! And I say again, labels have to be unique and pointers can point to the same memory location. So his last paragraph is nothing but pure BS. And on page 3, he advises to use shared_ptr which I agree we should but I don’t agree with the case that has been build up by Andrew to push for shared_ptr. I would have more approved the statements if he had made it WRT inheritance (especially multiple) and tried to base the facts on relationship between objects.
In any case, programming disciple is a must and people should be made about possible pitfalls of using certain programming constructs instead of just telling them that it is bad to use. Programming guidelines, checklist and deep knowledge of the language constructs as well as knowledge about how a compiler will work will not only help us get better code but will also help us write code with lesser bugs. I have seen many c#/java programmers who lack understanding of basic concepts simply because they are not present in the language and they haven’t been taught to deal with it. I personally feel that the rise in bloat is because of programmers not understanding a few simple and basic constructs.
In my next article, I will point out to some coding mistakes that I have seen in professional real-time code!