This essay is basically an attempt to outline my beliefs regarding the development of software. I typically program in a fantastic language called "C++". More specifically, I generally use a dialect of this language provided by Borland. It is known as "C++ Builder."
Please note that this is a work in progress that I only work on when my development station is rebooting or something similar is going on. Expect it to be largely incomplete, to have sections that end mid-stream, and to be generally flawed for a LONG time.
As you read this, you will find that I refer to a number of things as "evil." What I mean is that such things have a tendency to reduce the quality of a software development project. For example, they may have the undesired effect of making it easier for bugs to creep in. They may make it harder to read the code later on down the road. They could make compilation times grow so high that it becomes difficult to work. Any number of effects can make the cause "evil."
The list is too long for any one person to provide, but I'll try and touch on the highlights:
Raw Pointers are EVIL
"C" programmers are having heart attacks right now. "How could you say that pointers are evil? They allow us to do dynamic allocation, thereby freeing us from having to hard-code array sizes! They allow us to point to functions! They perform so many useful tasks as to make life without them pointless!" If this is your reaction, then you missed the key word : RAW. Pointers are not evil. RAW pointers are.
What's the difference? Let's start by examining what a raw pointer looks like in code:
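A minimal sketch of such a declaration, matching the description that follows (the variable name is an assumption, modeled on the names used later in this essay):

```cpp
// A raw pointer: an object of type int*, initialized with the value 0
// (the null pointer).
int* raw_pointer_to_int = 0;
```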
What we have done here is declare an object of type int* and stored the value 0 in it. This seems pretty innocuous. We've initialized the variable, so there's no danger that anything could be amiss. Read on, however:
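A sketch of the kind of code in question might look like the following. The function name and the loop are assumptions for illustration; only the allocation and the cleanup come from the surrounding discussion. It compiles, but it is deliberately flawed, so please do not copy it:

```cpp
const int buffer_size = 32;   // size taken from the surrounding discussion

// Hypothetical function: fills a heap buffer and then discards it.
// Deliberately flawed; it exists only to be read, not to be called.
void fill_and_discard()
{
    int* raw_pointer_to_int = 0;
    raw_pointer_to_int = new int[buffer_size];
    for ( int i = 0; i < buffer_size; ++i )
        raw_pointer_to_int[i] = i;
    delete raw_pointer_to_int;
}
```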
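Do you see the bug? It's not very obvious right off the bat. The problem is that we allocate with new int[32] but release with plain delete, without the []. Memory obtained from new[] must be released with delete[]; mixing the two forms is undefined behavior. How about this one?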
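A sketch of what such code could look like (the function name and the arithmetic are assumptions; the pointer names follow the discussion below). It runs correctly as long as both allocations succeed, which is exactly why the flaw is easy to miss:

```cpp
// Hypothetical function: adds two ints that live on the heap.
int add_on_heap( int a , int b )
{
    int* raw_pointer_to_int1 = new int( a );
    int* raw_pointer_to_int2 = new int( b );
    int sum = *raw_pointer_to_int1 + *raw_pointer_to_int2;
    delete raw_pointer_to_int2;
    delete raw_pointer_to_int1;
    return sum;
}
```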
This one's got a nasty little bug in it. Again, it isn't too obvious unless you've seen it or read about it before... Ready? Ok, what happens if the second new expression fails due to a lack of memory? If you answered that it returns NULL, then you're wildly out-of-date. No, new throws an exception (std::bad_alloc)! What does this do to raw_pointer_to_int1? If you answered "It leaks, duh!" then you're exactly right. Nothing ever deletes it, so it leaks like a sieve.
Both of these bugs (and a zillion others) can be avoided by simply not using raw pointers. Instead, use classes that wrap your pointers and manage them. In other words, always have an RAII library at work whenever you dynamically allocate ANYTHING. RAII stands for "Resource Acquisition Is Initialization," and is probably the least descriptive name in all of computer science. The premise is that because raw pointers (and other dynamically allocated objects) are EVIL, it is best to get assistance from the compiler to make sure that they do not run amok. The exact mechanism is to always wrap raw pointers (or handles, HWNDs, etc.) in utility classes whose sole purpose is to guarantee that said objects get destroyed and released at the appropriate time.
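As a rough illustration of the mechanism, here is a minimal sketch of such a utility class. The acquire_handle/release_handle pair is a made-up stand-in for a C-style API (it is not a real library), and scoped_handle is a hypothetical name:

```cpp
// Made-up stand-in for a C-style API that hands out handles.
static int g_open_handles = 0;
int  acquire_handle()      { return ++g_open_handles; }
void release_handle( int ) { --g_open_handles; }

// Hypothetical RAII wrapper: acquires in the constructor, releases in the
// destructor, and does nothing else.
class scoped_handle
{
public:
    scoped_handle() : m_handle( acquire_handle() ) {}
    ~scoped_handle() { release_handle( m_handle ); }
    int get() const { return m_handle; }
private:
    scoped_handle( scoped_handle const& );             // declared only: no copies
    scoped_handle& operator=( scoped_handle const& );
    int m_handle;
};

// Returns the number of handles still open after a wrapper has gone out
// of scope.
int handles_open_after_scope()
{
    {
        scoped_handle handle;
        (void)handle.get();    // use the handle here
    }                          // destructor releases it, however we leave
    return g_open_handles;
}
```

The compiler inserts the destructor call on every path out of the inner block, including one taken by an exception, so no cleanup code has to be written by hand.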
So how do we fix the examples above? The first one should be fixed by using a Standard C++ Library container, std::vector:
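A sketch of such a fix, assuming a buffer of 32 ints as in the discussion above (the function name and what it computes are made up for illustration):

```cpp
#include <vector>

// Hypothetical function: stores 0..31 in a vector and adds them up.
int sum_of_indices()
{
    std::vector<int> ints( 32 );    // 32 ints, owned by the vector
    for ( std::vector<int>::size_type i = 0; i < ints.size(); ++i )
        ints[i] = static_cast<int>( i );

    int total = 0;
    for ( std::vector<int>::size_type i = 0; i < ints.size(); ++i )
        total += ints[i];
    return total;
}   // ints releases its storage here; there is no delete to get wrong
```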
std::vector guarantees that it will dispose of the ints properly. If it uses new[] internally, then it will use delete[] to get rid of its sequence. Note: std::vector actually allocates and deallocates through an object (its allocator) whose type is specified by one of its template parameters, but you can generally think of it as using new[] and delete[].
The second example is easily fixed with the aid of the Standard C++ Library. There is a utility class there called std::auto_ptr. In C++98 it is the only smart pointer the Standard Library provides (C++11 later deprecated it in favor of std::unique_ptr, and C++17 removed it). It has a few minor quirks that can be surprising to the uninitiated, but all in all it is a true workhorse. Here is how you would use it:
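A sketch of that usage follows, with one substitution to be aware of: it uses std::unique_ptr in place of std::auto_ptr, because std::auto_ptr was removed from the language in C++17 and no longer compiles everywhere. With a C++98 compiler the two declarations would read std::auto_ptr<int> instead. The function name and the arithmetic are assumptions for illustration:

```cpp
#include <memory>

// Hypothetical function: adds two ints that live on the heap.
int add_on_heap( int a , int b )
{
    std::unique_ptr<int> pointer_to_int1( new int( a ) );
    std::unique_ptr<int> pointer_to_int2( new int( b ) );
    return *pointer_to_int1 + *pointer_to_int2;
}   // both ints are deleted here, with no delete in sight
```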
Notice the fact that we can drop the delete's. This is because when pointer_to_int1 and pointer_to_int2 fall out of scope, they will delete their respective ints. What's more, it doesn't matter how they fall out of scope. If the second new expression fails and throws an exception, pointer_to_int1 will still delete its int.
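If you would rather see that than take it on faith, here is a small experiment. Everything in it is an assumption made for illustration: tracked is a made-up type that counts its live instances, the throw stands in for a failing second new expression, and std::unique_ptr stands in for std::auto_ptr (which was removed in C++17):

```cpp
#include <memory>
#include <stdexcept>

// Made-up type that counts how many instances are currently alive.
struct tracked
{
    static int live;
    tracked()  { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// The first allocation succeeds; the throw simulates the second one failing.
void allocate_then_fail()
{
    std::unique_ptr<tracked> first( new tracked );
    throw std::runtime_error( "simulated allocation failure" );
}

// Returns how many tracked objects are still alive after the failure.
int live_after_failure()
{
    try { allocate_then_fail(); }
    catch ( std::runtime_error const& ) {}
    return tracked::live;
}
```

Stack unwinding destroys first on the way out, so the count drops back to zero. With a raw tracked* in its place, the count would stay at one.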
What about pointers to functions? They seem reasonably safe, that is, unlikely to introduce bugs. Don't they? Well, kind of. The problem is that they just aren't flexible enough. You can store a pointer-to-function OR a pointer-to-member-function. What's worse, for member functions, you can only store a pointer-to-member-function-of-a-SPECIFIC-class.
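To make the inflexibility concrete, here is a small sketch (all names are hypothetical). The two pointer types are unrelated, so a single callback slot cannot hold both, and the member version is tied to one specific class:

```cpp
int twice( int x ) { return 2 * x; }

struct scaler
{
    int factor;
    int apply( int x ) const { return factor * x; }
};

// Two different, incompatible pointer types:
typedef int (*free_function)( int );                  // pointer-to-function
typedef int (scaler::*scaler_member)( int ) const;    // pointer-to-member of scaler ONLY

int call_free( free_function f , int x )
{
    return f( x );
}

// A member pointer also needs an object to be called on.
int call_member( scaler const& s , scaler_member f , int x )
{
    return (s.*f)( x );
}
```

A member function of any other class, even one with an identical signature, cannot be stored in a scaler_member.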
Borland has a pseudo-workaround, the __closure data type, which is a pointer-to-any-class-member-function, but with an instance of the class tied in as well. So with C++ Builder you can store the address of a member function for a specific instance of some class. Luckily, it doesn't matter which class that is, so long as the member has the correct signature. Sadly, __closure is limited to class members. In fact, it is also limited to a single object handling the event. This is fine and dandy for the VCL's event handling system. Well, kind of... It works, but it isn't exactly extensible and leaves a lot to be desired.
Enter a genius by the name of Andrei Alexandrescu. In his book, Modern C++ Design: Generic Programming and Design Patterns Applied, he introduces a class template called Functor, part of his Loki library. A functor is an object that can be called via operator(). A Functor (notice the capitalization) is a template for generating functors. What is so special about this? Alexandrescu's implementation allows the caller to store a Functor that calls a global function, a member function on a specific class instance, a series of other Functors, some random Functor, and so on. All of this while allowing you to add even more special types of Functors that the caller has no concept of. Very cool indeed.
Ok, you say, but is this really superior to plain old function pointers? Couldn't you write something yourself that does the same job? Of course. You could also write all of your code using a hex editor to edit the executable itself. The point is that Functor provides a well-designed framework for a truly extensible and eminently usable callback mechanism. What's more, because Functor is a class type rather than a built-in pointer type, its default constructor can ensure that it is always initialized to something sensible.
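To give a feel for the idea without reproducing Alexandrescu's code, here is a rough sketch built on std::function, a facility that C++11 later added to the Standard Library. It is not Loki's Functor and makes no claim to match its interface; int_callback, twice, and scaler are hypothetical names:

```cpp
#include <functional>
#include <utility>

int twice( int x ) { return 2 * x; }

struct scaler
{
    int factor;
    int apply( int x ) const { return factor * x; }
};

// Hypothetical callback type: holds anything callable as int(int).
class int_callback
{
public:
    // Default constructor installs a harmless target (identity) rather
    // than leaving the callback empty.
    int_callback() : m_target( []( int x ) { return x; } ) {}
    int_callback( std::function<int( int )> target )
        : m_target( std::move( target ) ) {}
    int operator()( int x ) const { return m_target( x ); }
private:
    std::function<int( int )> m_target;
};
```

The same int_callback can then hold a free function, as in int_callback cb( &twice ), or a member function bound to a particular object, as in int_callback cb( [s]( int x ) { return s.apply( x ); } ) for some scaler s.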
Do you store HANDLEs, HWNDs, function pointers, and other raw pointers in your classes? Do these classes have a purpose other than to wrap those raw pointers and guarantee that they will get cleaned up? If so, then you are a member of the evil cult of dirty members.
Consider this code:
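A sketch of the kind of class under discussion (the stored values and the sum() member are assumptions added so that the class does something observable; the member names follow the discussion below). It works as long as both allocations succeed:

```cpp
class foo
{
public:
    foo() : m_int1( new int ) , m_int2( new int )
    {
        *m_int1 = 1;
        *m_int2 = 2;
    }
    ~foo()
    {
        delete m_int2;
        delete m_int1;
    }
    int sum() const { return *m_int1 + *m_int2; }
private:
    foo( foo const& no_copies );               // declared only: no copies
    foo& operator=( foo const& no_copies );
    int* m_int1;
    int* m_int2;
};
```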
It's got a bug, very similar to the second example under "Raw Pointers are EVIL". The problem is that if m_int2( new int ) throws an exception, m_int1 will leak. This is because objects that are not fully constructed will not have their destructors called.
The solution, of course, is to replace the members m_int1 and m_int2 with auto_ptrs or another smart pointer, like so:
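A sketch of such a class, with two assumptions: std::unique_ptr stands in for std::auto_ptr (which was removed in C++17), and the stored values and the sum() member are made up so that the class does something observable:

```cpp
#include <memory>

class foo
{
public:
    foo() : m_int1( new int( 1 ) ) , m_int2( new int( 2 ) ) {}
    // No destructor needed: each member deletes its own int.
    int sum() const { return *m_int1 + *m_int2; }
private:
    foo( foo const& no_copies );               // declared only: no copies
    foo& operator=( foo const& no_copies );
    std::unique_ptr<int> m_int1;
    std::unique_ptr<int> m_int2;
};
```

If initializing m_int2 throws, m_int1 is already a fully constructed member, so its destructor runs and its int is deleted.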
Ignore the foo( foo const& no_copies ); I'm merely trying to be consistent with some things I plan to add to this essay in the future.
The point I'm trying to make here is that if your class does anything other than manage a raw pointer, then it has no business storing them directly. If you have anything that needs cleanup, then it needs to be in its own class. I know it's a pain in the butt to write these piddly little classes to deal with resource management, but believe me, every time I have ignored my own advice and skipped that step, I have regretted it later. So learn from my mistakes and take the time to build a library of resource managers.
I guess you could argue that this is really the same as saying that raw pointers are evil. In a sense it is, but I wanted to point out that merely having a class manage your resources is not enough. You must have your resource management class do absolutely nothing else and manage exactly one resource.
How many times have you tracked down a bug to the fact that you didn't check the return code of a function? If you said "0", then I'm willing to bet that you're either lying, you forgot, or you're actually not a programmer.
When you write functions, do not use error codes to indicate failure. Use exceptions. They cannot be ignored. There are a million other reasons to use them over error codes, but this one really stands out to me. You cannot ignore exceptions. Your program will crash if you do. This is actually a good thing, believe it or not. If you forget to take a failure condition into account, and that condition occurs, is it safe to continue execution? No - your program is officially in the realm of untested behavior.
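A small sketch of the difference, with every name made up for illustration. The first caller forgets to check a return code and carries on as if nothing happened; the second cannot make that mistake, because the failure propagates on its own:

```cpp
#include <stdexcept>

// Error-code style: returns false on failure, which is easy to ignore.
bool withdraw_with_code( int& balance , int amount )
{
    if ( amount > balance )
        return false;
    balance -= amount;
    return true;
}

// Exception style: failure cannot be silently dropped.
void withdraw_or_throw( int& balance , int amount )
{
    if ( amount > balance )
        throw std::runtime_error( "insufficient funds" );
    balance -= amount;
}

// Forgets to check the return code and reports success regardless.
bool careless_purchase( int& balance , int price )
{
    withdraw_with_code( balance , price );
    return true;
}

// Has no check to forget; a failed withdrawal never reaches the return.
bool careful_purchase( int& balance , int price )
{
    withdraw_or_throw( balance , price );
    return true;
}
```

With a balance of 10 and a price of 50, careless_purchase reports success without having taken a cent, while careful_purchase throws before it can claim anything.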
If you get into the habit of writing functions that throw exceptions, you will also get into the habit of not checking to see if the functions you call return failure codes. This can be quite hazardous if you are using "C" APIs or 3rd party libraries that return error codes.
Obviously, you don't want to rewrite those APIs or libraries. The whole point of having an API or library is that somebody else has done the hard work for you. However, sometimes they did it wrong and you need to correct the error of their ways. The best approach to this is to write a wrapper for their functions that checks the return code and throws an exception. You then call this wrapper exclusively.
    #include <windows.h>
    #include <stdexcept>   // std::runtime_error

    // Wrapper for GlobalAlloc that reports failure by throwing instead of
    // returning a null handle.
    HGLOBAL global_alloc( UINT uFlags , SIZE_T dwBytes )
    {
        HGLOBAL result = GlobalAlloc( uFlags , dwBytes );
        if ( !result )
        {
            /* Note: you should really use something more specific than
               std::runtime_error... more on that later. */
            throw std::runtime_error( "GlobalAlloc Failed" );
        }
        return result;
    }
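The same pattern works for the Standard C Library. As a portable sketch, here is a wrapper around std::strtol, which reports failure through its end pointer and errno, both of which are easy to forget. The names to_long and rejects, and the choice of exception types, are assumptions for illustration:

```cpp
#include <cerrno>
#include <cstdlib>
#include <stdexcept>

// Converts text to a long, throwing instead of failing quietly.
long to_long( char const* text )
{
    char* end = 0;
    errno = 0;
    long value = std::strtol( text , &end , 10 );
    if ( end == text || *end != '\0' )      // nothing parsed, or trailing junk
        throw std::invalid_argument( "not a whole number" );
    if ( errno == ERANGE )                  // value does not fit in a long
        throw std::out_of_range( "number out of range for long" );
    return value;
}

// Helper for trying the wrapper out: true if to_long refuses the text.
bool rejects( char const* text )
{
    try { to_long( text ); }
    catch ( std::logic_error const& ) { return true; }
    return false;
}
```

Callers then use to_long exclusively and never touch std::strtol directly.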