Tuesday, October 18, 2011

Dennis Ritchie's Legacy

Dennis Ritchie passed away on October 12, 2011 at the age of 70. I never met the man, but we have all been profoundly affected by his work, particularly in the development of C.

It's hard to imagine a world without C, but prior to its development in the early 1970s, essentially all system-level programming was done in assembly code. That had two undesirable consequences: 1) there was no portability from one machine to another, and 2) writing assembly code was tedious and error-prone. Although higher-level languages, such as Cobol, Fortran, and Algol, were well established at the time, they sought to hide many of the low-level details that a system-level programmer must use. Here are some examples of features that were not available in these languages:
  • Bitwise logical operators
  • Integer variables of different sizes
  • Byte-level memory access
  • The ability to selectively bypass the type system via unions and casts
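
To make the list above concrete, here is a minimal sketch of each feature in present-day C. The function names are my own, chosen for illustration. The fixed-width types come from `<stdint.h>`, which arrived with C99 rather than with Ritchie's original language, and the expected bit pattern for a float assumes a 32-bit IEEE 754 representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer variables of different sizes (C99 fixed-width types). */
struct sizes_demo {
    uint8_t  one_byte;
    uint16_t two_bytes;
    uint32_t four_bytes;
};

/* Bitwise logical operators: extract `width` bits starting at bit `lo`. */
uint32_t extract_bits(uint32_t word, unsigned lo, unsigned width)
{
    assert(width > 0 && width < 32 && lo + width <= 32);
    return (word >> lo) & ((1u << width) - 1u);
}

/* Byte-level memory access: read byte i of any object through an
   unsigned char pointer. The caller must keep i within the object. */
unsigned char byte_at(const void *obj, size_t i)
{
    const unsigned char *p = obj;   /* void * converts implicitly in C */
    return p[i];
}

/* Selectively bypassing the type system with a union: view the bits
   of a float as an integer. Assumes float is 32 bits wide. */
union float_bits {
    float    f;
    uint32_t u;
};

uint32_t float_to_bits(float f)
{
    union float_bits fb;
    assert(sizeof(float) == sizeof(uint32_t));
    fb.f = f;
    return fb.u;
}
```

On a machine with IEEE 754 floats, `float_to_bits(1.0f)` yields `0x3F800000`, and `extract_bits(0xABCD1234u, 8, 8)` yields `0x12`.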

C had predecessors: the B language, developed by Ritchie's colleague Ken Thompson, which in turn arose from BCPL, developed by Martin Richards of Cambridge University in 1966. These predecessors had many limitations, however, and so C really is the first, and arguably still the best, language for implementing system software.

C was, to my knowledge, the first high-level language that embraced the byte (although referred to as char) as a first-class data element: something that could be used as a data value, read from or written to memory, and replicated as an array. Having the ability to operate on byte arrays enables C programmers to create arbitrary system-level data structures, such as page tables, I/O buffers, and network packets. Such a capability was previously only available to assembly language programmers.
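
As a small illustration of building a system-level structure out of a byte array, the sketch below packs a message into a buffer. The wire format is invented for this example and does not correspond to any real protocol: a 1-byte type, a 2-byte big-endian payload length, then the payload. The names `pack_message` and `payload_length` are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { HEADER_SIZE = 3 };   /* 1 type byte + 2 length bytes (made-up format) */

/* Write a header and payload into buf. Returns the number of bytes
   written, or 0 if the payload is too long or the buffer too small. */
size_t pack_message(unsigned char *buf, size_t cap, unsigned char type,
                    const unsigned char *payload, size_t len)
{
    assert(buf != NULL && (payload != NULL || len == 0));
    if (len > 0xFFFF || cap < HEADER_SIZE + len)
        return 0;
    buf[0] = type;
    buf[1] = (unsigned char)(len >> 8);     /* high byte of length */
    buf[2] = (unsigned char)(len & 0xFF);   /* low byte of length  */
    if (len > 0)
        memcpy(buf + HEADER_SIZE, payload, len);
    return HEADER_SIZE + len;
}

/* Recover the payload length from a packed header. */
size_t payload_length(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((size_t)buf[1] << 8) | buf[2];
}
```

Because every field is written one byte at a time, the layout does not depend on the byte order or struct padding rules of the machine running the code.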

The early C compilers did not produce very good code. To get C programs to run fast, programmers had to do a lot of low-level tweaking of the code, such as declaring local variables to be of type register, converting array references to pointers, and explicitly performing such optimizations as code motion, common subexpression elimination, and strength reduction. This made the programs nearly illegible. Serious code tuners often resorted to assembly language. Fortunately, modern compilers make most of this low-level hacking unnecessary.
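
To give a flavor of that hand tuning, here is a sketch of the same loop written twice: once plainly, and once in the style a programmer might have used with an early compiler. Both functions are hypothetical examples of my own, not code from any historical program. A modern optimizing compiler would be expected to generate similar code for the two.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward version: array indexing, with the multiply and the
   loop-invariant expression base * 2 written inside the loop. */
void fill_plain(int *a, size_t n, int k, int base)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = base * 2 + (int)i * k;
}

/* Hand-tuned version in the old style. Produces the same values for
   inputs small enough that no intermediate result overflows. */
void fill_tuned(int *a, size_t n, int k, int base)
{
    register int *p;            /* register hint for locals          */
    register int *end;
    register int v;

    assert(a != NULL);
    p = a;                      /* array reference turned into a pointer */
    end = a + n;
    v = base * 2;               /* code motion: invariant hoisted out    */
    while (p < end) {
        *p++ = v;
        v += k;                 /* strength reduction: add replaces multiply */
    }
}
```

The tuned version is shorter on multiplications but noticeably harder to read, which is the trade-off described above.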

Consider how much code is still being written in C: all of Linux, all GNU software, and much more. If we extend this to its younger brother C++, we cover the vast majority of computer systems worldwide.

Dennis, thanks for the great language you created.