Building Solid Code - Assertions in EGO

No programmer likes bugs in his or her code. Still, bugs do not creep into code by themselves. Programmers put bugs into the code, without really wanting to. This "technical note" is about how programmers can avoid some of the practices that result in buggy software.

Most of the text of this paper is general purpose; some of it applies to the EGO development system.

What exactly is a bug?

The introductory paragraph lays all blame with the programmer, and this is not quite fair, to say the least. Apart from bugs in the implementation, there are:

bugs in the feasibility study (the cause of budget overflows and cancelled projects),
bugs in the user expectations,
bugs in the specification(s),
bugs in the compiler and other development tools, and bugs in the operating system,
bugs in the testing phase of the product (the "why didn't testing catch that bug" syndrome),
and bugs in the documentation (we all know documentation errors by now).

That aside, the focus of this note is to prevent bugs in the implementation of a program.

I also loosely define a bug as "a functioning of the software that goes contrary to the programmer's intentions". If a programmer did not consider supporting Unicode in his product, that is not a bug (it may be a limitation that makes the software unusable for some applications, but it is not a bug). If, however, the software supports Unicode, but translates legacy texts using the wrong "codepage", that is a bug.

Should the compiler/interpreter not catch all bugs?

This issue has both technical and philosophical sides. I will forgo all non-technical aspects and only mention that, in practice, there is a trade-off between the "expressiveness" of a computer language and the "enforced correctness" (or "provable correctness") of programs in that language. At the extreme, a programming language in which one cannot make an error will be so severely restrictive that it cannot do much real work either.

Making a language very "strict" is not a solution if work needs to be done that exceeds the size of a toy program. A language that is too strict leaves the programmer struggling with the language, whereas a language is supposed to be a simple means of expressing algorithms. When they need to work in a strict language, programmers will invent ways around the protection mechanisms:

The argument has been made that Pascal is a better language than C because the chance that you will accidentally type "begin" is much smaller than that you unintentionally type "{", but programmers were quick to create keyboard macros to be able to "type" the word "begin" with a single key stroke. Other verbose language constructs are conveniently circumvented by copy-and-paste actions, thereby also circumventing the safety checks in the language.
The current trend in strict type checking (in a reaction to the very lax type checking of the pre-ANSI "C" programming language) has led to an abundance of type casts. It has become common to see sample code from Microsoft Corporation that has type casts in cascades (as if the programmer was confused about which type cast was really necessary) and type casts being applied where none was necessary (for example, casting an expression to the type that it already has). Remember that a type cast is not the equivalent of a type conversion, rather it is the equivalent of telling the compiler/interpreter: "shut up, I know what I am doing". A type cast temporarily disables the type checking mechanism.
For example, below is a code snippet from a demo application for the Microsoft Speech API (SAPI) 5.0:
```
SPVOICESTATUS  Stat;
WPARAM         nStart;

  /* (67 lines omitted) */

nStart = (LPARAM)( Stat.ulInputWordPos / sizeof(char) );
```
Stat.ulInputWordPos has the type ULONG, which turns out to be "unsigned long"; the sizeof operator gives the type size_t; WPARAM, finally, is defined as UINT, which in turn is "unsigned int". A compiler for the Win32 API has size_t set to unsigned int, and both the int and long types are 32-bit. The upshot is that all three symbols used in the expression are unsigned 32-bit integers. The cast of the division result to LPARAM (a signed long type) is completely unnecessary, and actually confuses the matter. Looking at the expression alone (remember that there are 67 lines between the declaration of nStart and the expression) might lead you to think that nStart holds a signed integer value.

In conclusion, strict programming languages lead to programmers inventing work-arounds. Sometimes such work-arounds are required, sometimes they are just the easy way out. The most problematic area is that sometimes work-arounds are applied where they are not necessary and occasionally where they do harm.

The Microsoft terminology for doing things that you are not supposed to do, but do anyway because of sheer convenience or for lack of "clean" alternatives, is "partying". A pair of lines that I have seen in various Microsoft code examples is:
    RECT rectMain;
    GetClientRect(hwndMain, &rectMain);
    ClientToScreen(hwndMain, (LPPOINT)&rectMain.left);
    ClientToScreen(hwndMain, (LPPOINT)&rectMain.right);
The goal is to offset a rectangle from zero-based "client coordinates" to screen coordinates. The Windows API does not have a function that is immediately obvious for this task, but for moving a POINT structure there is ClientToScreen(). The RECT does not consist of two POINTs; the structures are independent of each other. Calling ClientToScreen() on the fields of a RECT is therefore risky, but it is two lines shorter than the clean alternative. This is a typical case of partying with the Win32 API.
Partying has its drawbacks too: the above-mentioned code snippet will not work correctly for bidirectional software (for the Arabic and Hebrew languages). In fact, this code snippet popped up again in a Microsoft technical article on writing international software, in the section discussing common API abuse (without naming the guilty party).

The EGO language was designed to avoid any artificial verbosity, and its type checking is fairly loose. The goal of the EGO language is to provide the developer with an informal, and convenient to use, mechanism to test whether the program behaves as was intended. This mechanism is called "assertions" and, although the concept of assertions pre-dates the idea of "design by contract", it is most easily explained through the idea of "design by contract".

Design by contract

The "design by contract" paradigm provides an alternative approach for dealing with erroneous conditions. The premise is that the programmer knows the task at hand, the conditions under which the software must operate and the environment.

In such an environment, each subroutine (or macro/co-module) specifies the conditions, in the form of assertions, that must hold true before a client may execute the subroutine. In addition, the subroutine may also specify any conditions that hold true after it completes its operation. This is the "contract" of the subroutine.

The name "design by contract" was coined by Bertrand Meyer, the designer of the language Eiffel, and the principles trace back to predicate logic and algorithmic analysis.

Preconditions specify the valid values of the input parameters and environmental attributes
Postconditions specify the output and the (possibly modified) environment
Invariants indicate the conditions that must hold true at key points in a subroutine, regardless of the path taken through the subroutine

For example, a function that computes a square root of a number may specify that its input parameter be non-negative. This is a precondition. It may also specify that its output, when squared, is the input value ±0.01%. This is a postcondition; it verifies that the routine operated correctly. A convenient way to calculate a square root is via Newton iteration. At each iteration, this algorithm gives at least one extra bit (binary digit) of accuracy. This is an invariant (it might be an invariant that is hard to check, though).
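The square root example can be sketched in C with the standard assert() macro. This is a minimal sketch under stated assumptions, not a definitive implementation: the name newton_sqrt is hypothetical, the starting estimate is chosen at or above the true root so that the estimates shrink towards it, and the hard-to-check "one extra bit per iteration" invariant is swapped for a simpler one (the estimate stays positive). Inputs of extreme magnitude, where squaring the result would overflow or underflow, are not considered.

```c
#include <assert.h>

/* Hypothetical helper, not part of the original note:
   square root by Newton iteration, with its contract as assertions. */
double newton_sqrt(double x)
{
    double r, next, err;

    assert(x >= 0.0);                /* precondition: non-negative input */
    if (x == 0.0)
        return 0.0;

    r = (x > 1.0) ? x : 1.0;         /* start at or above the true root */
    next = 0.5 * (r + x / r);
    while (next < r) {               /* stop when the estimate no longer shrinks */
        r = next;
        assert(r > 0.0);             /* invariant (simplified): estimate stays positive */
        next = 0.5 * (r + x / r);
    }

    err = r * r - x;
    if (err < 0.0)
        err = -err;
    assert(err <= 0.0001 * x);       /* postcondition: result squared is x within 0.01% */
    return r;
}
```

A call such as newton_sqrt(-1.0) violates the contract and trips the precondition in a debug build; the caller, not the routine, is at fault.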

Preconditions, postconditions and invariants are similar in the sense that they all consist of a test and that a failed test indicates an error in the implementation. As a result, you can implement preconditions, postconditions and invariants with a single construct: the "assertion". In EGO, the assertion command is called ".verify()" or, in French, ".verifie()".

An assertion is a simple statement that contains a test. If the test outcome is "true", nothing happens. If the outcome is "false", the assert instruction pops up a message box and drops into the debugger.

One additional nicety of assertions is that when you build the retail version of your software ("DIDAX"), the assertions are silenced. In other words, adding many assertions to your code does not slow down the retail version of the software.
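For readers who know C better than EGO, the behaviour described above can be approximated with a small macro. The sketch below rests on assumptions: VERIFY, verify_failed, verify_failures, percent and the RETAIL_BUILD switch are hypothetical names (RETAIL_BUILD stands in for a DIDAX build), and the failure handler only prints a message and counts the failure, where EGO pops up a message box and drops into the debugger.

```c
#include <stdio.h>

int verify_failures = 0;   /* count of failed checks, for illustration only */

/* Report a failed check. A real debug build would stop in the debugger here. */
void verify_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s(%d): verify failed: %s\n", file, line, expr);
    verify_failures++;
}

#ifdef RETAIL_BUILD
/* Retail build: the check disappears; the test is not even evaluated. */
#define VERIFY(test)  ((void)0)
#else
/* Debug build: nothing happens when the test is true; report it otherwise. */
#define VERIFY(test)  ((test) ? (void)0 : verify_failed(#test, __FILE__, __LINE__))
#endif

/* Hypothetical routine guarded by a check on its input. */
int percent(int part, int whole)
{
    VERIFY(whole > 0);     /* the caller must pass a positive total */
    return (100 * part) / whole;
}
```

Because the retail definition expands to nothing, the cost of a check in the retail version is zero, which is what makes it practical to add many of them.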

Using assertions

Whenever you track down a non-trivial error in your code, think...

Think how this error could occur and, before fixing the error, think what assertion would have trapped this error. Before fixing the error, put that assertion in the code and test that it really does work. Then fix it (and test again).

You have just fixed the error. So it will not happen again. So why leave the assertion in? What is it still good for? There are two goodies:

Bugs sometimes come back, especially if you are working in a team. It is not uncommon that a team member overwrites corrected modules with his or her local modules (I have seen this happen even with "version control" in place). Also, team members may "fix" your fix to the error, as surprising as this sounds.
Most bugs are caused by special circumstances: inputs that were not expected, synchronization errors due to an unforeseen heavy load of the network, and so on. The exceptional conditions that caused this bug may also uncover other bugs.

As already stated, DIDAX removes assertions. So you should never put production code inside an assertion. Similarly, error conditions that can occur on the client's machine (such as a missing file) should be tested for (and handled) explicitly and not via an assertion.
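Both rules can be illustrated in C, where defining NDEBUG removes assert() in much the same way that a retail build silences assertions. This is a sketch only; count_bytes is a hypothetical helper. The file is opened outside the assertion, and a missing file is reported through the return value rather than asserted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of bytes in a file,
   or -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    FILE *fp;
    long n = 0;

    assert(path != NULL);      /* programmer error: caught by an assertion */

    /* WRONG: assert((fp = fopen(path, "rb")) != NULL);
       the fopen() call itself would vanish along with the assertion. */
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;             /* condition on the client's machine: handled explicitly */

    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

The assertion guards against a mistake by the calling programmer; the explicit test guards against a condition that can legitimately occur when the finished program runs.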

Finally, if you are working in a team and you are adding assertions to existing code, inform people (your team members, your manager) what you are doing. As confident as you may be in yourself and in your colleagues, expect the error rate to go up when you first start using assertions in an existing project. The reason is that erroneous conditions frequently go unnoticed by tests, or they are silently corrected or rejected by other parts of the program code.

Projects are often managed on strict schedules and the status of a software project is often evaluated on the progress of features that are implemented and the number of bugs that are fixed. If, due to adding assertions, the number of known bugs increases (not because there are more bugs, but because more existing bugs become "known"), some people may get nervous.

The upshot is that, due to there being fewer "hidden" bugs, the status of the project is clearer and the remaining work is easier to schedule. Assertions also help in gaining confidence in your code. Being able to say that a certain class of bugs can no longer go undetected in your code is nearly as good as saying that none of your team will ever make a bug of that class.