Why I Love C (and Hate C++)

C and C++ come from absolute opposite ends of a spectrum. The former is, in a real sense, the lowest-level high-level programming language (it gives most of the keystroke-saving benefits of a high-level language while the user retains low-level control and flexibility). The latter is, in many ways, the highest-level programming language to achieve a following. By ``low-level'' I mean the C programmer is ``in the driver's seat'' (and with a manual transmission at that) guiding the computer through its exact sequence of subtractions, function calls, etc. By ``high-level'' I mean that the C++ programmer can write, for example, Bombay.destroy without even remembering what Bombay is. The C++ compiler will choose from its repertoire and ``do the right thing.'' If Bombay is just a memory array, a version of free() might be all that's needed, while if Bombay represents a real city, the compiler might generate code to open /dev/icbm and launch a first strike.

I'm sure many people will disagree with me, and I'll agree the previous sentence may seem a bit ... hyperbolic, but the low-level/high-level dichotomy of C and C++ is crystal clear.

It happens that I do hate C++ but that isn't the point. I don't like Ada, PL/1, Basic or Cobol either, but they're entitled to their niches. What ``rubs me the wrong way,'' however, is the managerial belief that C++ is ``just'' an improved C and that, just as Fortran IV users happily switched to Fortran-77, so C users should be grateful that their language is obsolete.

In fact C++ and C are as different as night and day.


How Bill Gates and Mike Dell Conspired to Ruin my Hard Disk

I dimly recall reading that a Federal judge will help me get my $60 back from Bill Gates, since I never wanted the OS he forced me to purchase for my laptop. But I'm afraid I may not qualify anyway: I did need to boot Windows a few times. That's because the $3000 purchase price for the Dell laptop didn't include a printed manual; to get even the simplest answers about the hardware one is expected to boot Windows and click on a Help icon. Other than that I was quite happy to stick with software by Linus and Stallman and guys like that. Unfortunately Billy or Mikey's software insisted on taking charge at one point so they could trash my hard disk. Let me tell you about it.

Working with the laptop one day, the battery charge ran out and the machine dropped power. Since I keep the machine hooked into the wall outlet I was surprised ... until I looked at the AC connection and saw it had come loose. A dim LCD icon had been warning me of this, I'm sure, but I had never acquired the habit of checking.

There is a loud audible alarm for the low-battery condition but it doesn't turn on until the last few seconds, after the keyboard has been disabled and a built-in ROM routine is taking a ``checkpoint.'' If they're going to waste some of their last few ergs on a loud alarm anyway, it would have been thoughtful to send the signal early enough to have a purpose, like reminding me to save an edit buffer.

Lack of a rational alarm didn't bother me: with software like fsck and ex -r, crashes cause little trouble. I plugged the machine in, charged the battery a little and powered on, hoping to see the happy Lilo: prompt.

But Billy and Mikey knew better.

The billionaires at Microsoft and Dell had anticipated my every need and they didn't feel I needed a Lilo: prompt just then. Instead they invoked an automatic restart procedure which could not be interrupted or disabled. Generalized checkpoint/restart is always difficult due to hardware registers, and impossible on most systems, but it can often be achieved with OS cooperation. Linux couldn't cooperate, of course; Bill's contract with Michael requires that firmware and hardware details be kept secret from Linux developers, but this fact didn't stop the laptop from doing a restore. It assumed the OS was Windows, and if instead another OS had been running so that garbled control plops into Linux kernel code, why that can be called a user error!

The same thing had happened to me before. You just wait for the red disk-activity LED to go out, cycle power, wait for the Lilo: prompt, run fsck and so on. You get a little annoyed at the extra step, but can spend the wasted minute writing an anti-Microsoft polemic.

But this time the red disk-activity LED didn't go out.

I don't know what the ``smart boys'' would have done then (besides vowing to buy no more Dell hardware). None of the buttons or controls on the laptop would have any effect except the power off switch, but I suppose I could have disconnected the AC again, hoping to eventually transfer control back to the low-battery checkpoint firmware. I waited a good while, planning to jab the power switch as soon as the disk activity LED went out, but it never did. I jabbed the power-down button anyway but I wasn't happy about it. I've forgotten much of what I knew two decades ago when I repaired million-dollar mainframes for a living, but I did know dropping power on an active drive was a No-No. With write-gate off there'd be no problem on a properly designed disk controller, but none of Billy or Mikey's designs had ever struck me as ``proper.''

Sure enough, I got a permanent format error, and lost dozens of i-nodes. Laptop's still got the format error in fact, two years later. I could fix it, but it's got other problems and a new computer costs less than the emotional cost of tracking down format-track documentation for Mikey's obsolescent laptop. I still boot Linux on the broken laptop from time to time (carefully avoiding the bad i-nodes) to hunt for old source files.

Right now you're wondering what this has to do with the issue of C versus C++, especially since Microsoft encourages its customers to write their viruses in the language that brought Billy his original fame and fortune: BASIC. But it's the ``philosophy'' of C++ I object to most, the sadistic idea that controlling the user is a designer's highest goal, and in this anecdote the anuses of the billionaires who trashed my hard disk are seen to exhibit this ``philosophy'' in spades.

Background: Friendly vs KISS schools

Computing languages had been used earlier, but did not ``catch on'' until the 1960's. The early compilers usually implemented a convenient but sloppy subset of a hypothetical fine or pristine language. The 1970's was a period of strong development in languages and compilers, and two dominant, but conflicting, trends emerged.

One trend is labelled ``KISS'' (Keep it simple, sailor). It is arguably related to the ``WYSIWYG'' trend in text-processing. While earlier languages often had quirks, interpreters for LISP and Forth perfectly emulated a simple (elegant), clear (deterministic) language (abstract machine). The culmination of this trend was C, the portable Assembly language. (Lisp, Forth, Prolog, etc. are arguably more elegant but C has many systemic advantages: system control, speed, clarity.)

The other trend is called ``User Friendliness.'' This trend has become so omnipresent I may bore you with examples, but its flaws are often ignored. Designers from this cult often view the language spec as a flexible matter, adding special cases whenever they'll provide friendly features to the user!

I don't approve of the friendly-language school, so this webpage may seem slanted.

I even start to see trouble in the ``friendly'' Ansi C rule that sqrt(4) is silently replaced with sqrt(4.0), the idea being to ``correct an obvious error or shorthand.'' One sees how pervasive ``friendliness'' has become when one recalls that ``Ansi C'' is the paradigm of the antithetical KISS movement.

Like the defense attorney who advises his client to plead the fifth when asked for name and address, I would oppose the Ansi C rule just mentioned. No, it's not over-dangerous by itself, but it lets a tainted view into the room, and next thing you know, you end up with C++.

C -- A masterfully SIMPLE language

Let's take a look at C. It's a shame how it's acquired a reputation as an unfriendly `obfuscable' language: the reality is quite different. First note that many of the `obfuscating' mechanisms are just simple abbreviations. Thus
 
 
Simple meanings of C operators

  ``Obfuscated'' operation      Simple translation
  ------------------------      ----------------------
  (++p)                         (p += 1)
  (p++)                         ((p += 1) - 1)
  (p && q)                      (p ? q ? 1 : 0 : 0)
  (p || q)                      (p ? 1 : q ? 1 : 0)
  (!p)                          (p ? 0 : 1)
  (p == 0)                      (p ? 0 : 1)

Thus the weird operators, which some consider ``obfuscating,'' are just abbreviations for combinations of a smaller set of operators. Once you commit just this to memory, any remaining mystery about these five operators disappears. You'll still have to memorize the meanings of


        (p += 1)
and

        (p ? r : s)
but that should be easy, with their simple mathematical definitions. (This is in contrast to constructions in C++ where one may, in principle, need to solve an inheritance maze before tracing an overloaded operator to a named function. Often, in keeping with broader C++ philosophy and market configuration, the named function is provided as object code only.)

It is important to stress that the identity between the C expressions in the above table is exact. In our fuzzy world one often says, ``classes are like structures'' or ``functions are like operators'' meaning that there is an interesting set of similarities. But here, in our specifications of certain C constructions, the identity of result must prevail in all cases.

Another simple translation, which should clarify the use of for and its associated break and continue, is as follows:


        /* Example fragment with `for', `break' and `continue' */
        for (A; B; C) {
                ...
                if (D)
                        break;
                else if (E)
                        continue;
                ...
        }

Here's the same program, rewritten to avoid unnecessary keywords:


        /* Same code without `for', `break' and `continue' */
        if (A, B) do {
                ...
                if (D)
                        goto Out;
                else if (E)
                        goto Next;
                ...
        Next:   ;       /* a label must be attached to a statement */
        } while (C, B);
        Out:    ;

Again, be confident this translation is exact; there's no looseness or ambiguity here, as there was for example in Fortran IV where index variables or `CONTINUE' statements had peculiar semantics in some compilers.

Pointer syntax

Any C source will be either (a) comments, (b) preprocessor-related, (c) a type declaration, or (d) ordinary procedural expressions. There's much to say here: this web page isn't finished. In particular I want to say that all the following are absolutely identical when they occur in ordinary procedural expressions:


        *(Arr + 7)
        *(7 + Arr)
        Arr[7]
        7[Arr]
        *& 7[Arr]

The last line here is worth special attention: it shows that `*' and `&' are each other's inverse operation. Thus the following expressions are all equal to each other, and denote the address of the datum expressed in the previous list.


        (Arr + 7)
        (7 + Arr)
        & Arr[7]
        & 7[Arr]
        &*& 7[Arr]

Subtraction in the C Language

Several kinds of subtraction are possible in C. Letting i, f, and p denote integer, floating-point number and pointer respectively, all of the following forms are valid:


	i - i
	f - f
	f - i
	p - i
	p - p

It is straightforward to take the difference of two integers or of two floating-point numbers, but already the form f - i represents something non-trivial: the C compiler will need to convert the integer to a float and develop a floating result.

Subtraction with pointers is more special. Here's a program which computes the difference of two pointers:


	char	Array[30][7];
        char    (*Arr11)[7] = &Array[11];
        char    (*Arr13)[7] = Array + 13; /* same as &Array[13] */

	printf("Value of Arr13 - Arr11 = %d\n", (int)(Arr13 - Arr11));	/* %d wants an int */
	printf("Value of Arr13 - Arr11 = %d\n", (int)Arr13 - (int)Arr11);

The two numbers printed by this program will be 2 and 14. The `2' comes as no surprise considering that (Array + 13) - (Array + 11) is what is computed. The answer must be 2 as long as ordinary rules like associativity apply. Indeed much of the confusion beginners have about C pointers would disappear if they just remember that p + i is a synonym of &p[i] and that pointer arithmetic obeys the rules of ordinary arithmetic. (This is one of the marvelous facts that make C the favorite language of so many professional programmers; yet there are continual suggestions by well-intentioned ``experts'' to ``improve'' C by making p == &p[1] or even p == &p.)

With this example of the subtraction form p - p, rules of arithmetic dictate what p - i must mean: (Arr13 - 2) in the example gives a pointer to the 11th object in Array. This isn't all quite as trivial as it might seem; since these particular objects are 7 bytes in size, the actual difference between the machine representations of the 13th and 11th pointers will be 14 (in most implementations). This is shown in the second number printed by the example program.

These simple rules lead directly to the elegant treatment of pointer and array syntax described above.

Addition and subtraction thus have multiple meanings in C, but there are very few cases; and the cases are simple, fixed and easily remembered. The ordinary rules of arithmetic are obeyed. I like C's treatment of addition and subtraction.

In C++, the programmer can create as many versions of addition and subtraction as he likes, and there is no obligation to obey rules like associativity. Someone reading a program may need to examine many source files, and even solve an ``inheritance maze'' just to learn the name of the function that will perform a given + or -. The experts are still debating how the C++ compilers should resolve an inheritance maze in ambiguous cases.

Even worse, when you finally determine the name of the function invoked by a C++ operator, you will often discover that its source code is unavailable, that it was purchased in an object-only form. (Of course there's nothing in the language specs that makes C open source and C++ secret source, but this is the philosophy and indeed ``secret source'' is used as a major advertising point for C++!)

I despise C++ and its philosophy from beginning to end. Oddly, C++ principles are not an isolated travesty but symptomatic of an unfortunate trend which has polluted many computer design choices. How this came about and why it is lamentable is the subject of this essay.

Smart Objects --- Just Say No

A friend handed me a CD saying, ``Funny pictures and stories: check 'em out.'' I don't boot MS-Windows very often, but I remembered how to type `Braindead' at the Lilo prompt, and soon I was looking at some strange-looking icons.

Some were video clips, some still images, and some just appeared like text files, but even the text files invoked Microsoft Outlook when clicked and kept my hard disk blinking far longer than seemed appropriate. I was worried about viruses: MS-Windows wouldn't be missed but I keep Linux, my family tree, etc. on the same hard disk, so caution seemed in order. The videos and pictures weren't funny, but I thought I might as well read the text files. I didn't understand why Outlook needed to spin the disk for a full minute every time it read a one-paragraph text file, so I decided to read the files with less presumptuous software.

The operating system I use has the concept of programs and data, which can be very convenient. Among the many very simple programs which can be used to display the text of a data file are more, less, od, tail, head, cat, strings, grep, emacs, and vi. These programs are all very innocuous. In particular, their simplistic outlook means that if their input data is a virus the virus will not spread. I guess in the modern philosophy that makes them ``stupid'' programs.

I know there are Microsoft fans out there. If you're one you're probably thinking ``But Microsoft can do anything that those programs can do and more.'' Ignoring that this statement is untrue anyway, I didn't want to do ``more than'' just display my text file -- I just wanted to display it.

As I say, there are many such programs to choose from in Linux, but I'd already booted MS-Windows. I know very little about Microsoft software, but it seemed plausible that there would be at least one program that could display a text file (and at least fail gracefully if the input wasn't a text file). I know Microsoft has spent many billions of dollars serving humanity; it didn't seem farfetched that at some point they'd invested a few million dollars implementing at least a subset of what my favorite Unix utility, cat, can do.

I decided to try Notepad. I'd had good luck with it in the past. (I don't know if they still ship the only other Microsoft editor I've ever had use for: Edlin.)

From Notepad I selected the Open option and was presented with a list of files. In keeping with the Microsoft philosophy of user-friendliness, the file-suffixes were not shown: these tell Windows what to do with the files, but apparently Microsoft thought it would confuse me if I knew what to expect. I clicked somewhere at random, expecting to either see some text, or a message with a meaning like ``Notepad doesn't think this is a Notepad data file.''

Instead a video clip was launched. If `Microsoft documentation' isn't an oxymoron altogether, perhaps there's a write-up somewhere that describes Notepad as ``the world's simplest text-editor ... which BTW launches a video clip when that seems like a friendlier idea.'' I'm sure any Unix user would be flabbergasted at the idea that od foobar should launch an MPEG driver from the shell prompt whenever foobar looks like a video clip instead of an ``octal file.'' I'm also sure that most of a generation brainwashed by Microsoft hasn't the foggiest idea what I'm talking about.

Well, that's the end of this anecdote; I hope you're not waiting for a better punchline, or formulating some sincere e-mail to educate James Allen that MS Word(TM) is a better editor than Edlin, or that Microsoft would implement cat if I rounded up a million customers willing to pay $19 each. I would strongly support the distinction between programs and data even if there were no such thing as computer viruses, but since I clicked on Notepad only in an effort to avoid a possible virus, let me offer a comment on that topic:

Some members of the general public probably imagine the virus industry as analogous to a competition between expert lock manufacturers and very clever burglars. In fact, however, Microsoft and many other vendors not only provide no `locks' but seem to deliberately make things as easy as possible for hapless `burglars.' The worldwide spread of the Lovebug virus, for example, which caused billions of dollars in damages, was inadvertent.

I use the term ``virus industry'' deliberately. A lot of people would be out of work if users stopped buying regular upgrades to their virus protection software ... or if programs like Outlook were redesigned to behave sensibly.

Smart Objects --- Just Say No, part 2

When I clicked on the icon, whose meaning was concealed from me, MS-Windows ignored the program I had told it to run, and reacted to the data it was presented with. This is a very up-to-date philosophy, though I don't approve.

Here's a simple program to illustrate a key difference between old-fashioned C and Ansi C.


#include        <stdio.h>
#include        <math.h>

float squareroot(x) float x;
{
        return sqrt(x);
}

main()
{
        printf("squareroot(4) = %f\n", squareroot(4)); 
        printf("sqrt(4) = %f\n", sqrt(4));
}

Here's the output of this program:


squareroot(4) = nan
sqrt(4) = 2.000000

Since the square root of 4 is indeed 2, many people will consider it obvious that sqrt() is better behaved here than squareroot(), with its ``improper'' prototype.

But not me.
