Wednesday, August 26, 2015

Diane's silk dress costs $89

What could a woman's wardrobe have to do with computer systems?

This is a clever mnemonic devised by Geoff Kuenning of Harvey Mudd College to help him remember which registers are used for passing arguments in a Linux x86-64 system:

%rdi:   Diane's
%rsi:   Silk
%rdx:   dress
%rcx:   costs
%r8:    $8
%r9:    9

Thanks to Geoff for providing this helpful aid!

Tuesday, June 2, 2015

Third Edition: Ready for Fall Courses

The Third Edition of Computer Systems: A Programmer's Perspective came out in March.  The CS:APP web page now contains information for this edition, with a link to the web pages for the second edition.  We already have a (fortunately small) errata page.

This fall, we will be teaching 15-213, the CMU course that inspired the book originally.  Leading up to that, we will update the lecture slides and the labs, and we will make them available on the instructors' site.

Wednesday, February 11, 2015

The third edition will be out March 11, 2015

We spent much of 2014 writing and revising the book.  We feel the new edition brings the book up to date, and that the presentation of some of the material is clearer.

According to Amazon, the book will be available starting March 11.

Here are some chapter-by-chapter highlights:
  • Ch. 2 (Data): After hearing many students say "It's too hard!" we took a closer look and decided that the presentation could be improved by more clearly indicating which sections should be treated as informal discussion and which should be studied as formal derivations (and possibly skipped on first reading).  Hopefully, these guideposts will help students navigate the material, without us reducing the rigor of the presentation.
  • Ch. 3 (Machine Programming): It's x86-64 all the way!  The entire presentation of machine language is based on x86-64.  Now that even cellphones run 64-bit processors, it seemed like it was time to make this change.  Eliminating IA32 also freed up space to put floating-point machine code back in (it was present in the 1st edition and moved to the web for the 2nd edition).  We generated a web aside describing IA32.  Once students know x86-64, the step (back) to IA32 is fairly simple.
  • Ch. 4 (Architecture): Welcome to Y86-64!  We made the simple change of expanding all of the data widths to 64 bits.  We also rewrote all of the machine code to use x86-64 procedure conventions.
  • Ch. 5 (Optimization): We brought the machine-dependent performance optimization up to date based on more recent versions of x86 processors.  The web aside on SIMD programming has been updated for AVX2.  This material becomes even more relevant as industry looks to the SIMD instructions to juice up performance.
  • Ch. 7 (Linking): Linking has been updated for x86-64.  We expanded the discussion of position-independent code and introduced library interpositioning.
  • Ch. 8 (Exceptional Control Flow): We have added a more rigorous treatment of signal handlers, including signal-safe functions.
  • Ch. 11 (Network Programming): We have rewritten all of the code to use new libraries that support protocol-independent and thread-safe programming.
  • Ch. 12 (Concurrent Programming): We have increased our coverage of thread-level parallelism to make programs run faster on multi-core processors.

Friday, June 13, 2014

Third edition in the works

We've gotten started on the third edition of CS:APP.  The biggest change will be that we will shift entirely to 64 bits.  It seems like that shift has finally occurred across most systems, and so we can say goodbye to 32-bit systems.

Here's a summary of the planned changes for each chapter.
  1. Introduction.  Minor revisions.  Move the discussion of Amdahl's Law here, since it applies across many aspects of computer systems.
  2. Data.  Do some tuning to improve the presentation, without diminishing the core content.  Present fixed word size data types.
  3. Machine code.  A complete rewrite, using x86-64 as the machine language, rather than IA32.  Also update examples based on a more recent version of GCC (4.8.1).  Thankfully, GCC has introduced a new optimization level, specified with the command-line option -Og, that provides a fairly direct mapping between the C and assembly code.  We will provide a web aside describing IA32.
  4. Architecture.  Shift from Y86 to Y86-64.  This includes having 15 registers (omitting %r15 simplifies instruction encoding) and all data and addresses being 64 bits.  Also update all of the code examples to follow the x86-64 ABI conventions.
  5. Optimization.  All examples will be updated (they're mostly x86-64 already).
  6. Memory Hierarchy.  Updated to reflect more recent technology.
  7. Linking.  Rewritten for x86-64.  We've also expanded the discussion of using the GOT and PLT to create position-independent code, and added a new section on the very cool technique of library interpositioning.
  8. Exceptional Control Flow.  More rigorous treatment of signal handlers, including async-signal-safe functions, specific guidelines for writing signal handlers, and using sigsuspend to wait for handlers.
  9. VM.  Minor revisions.
  10. System-Level I/O.  Added a new section on files and the file hierarchy.
  11. Network programming.  Protocol-independent and thread-safe sockets programming using the modern getaddrinfo and getnameinfo functions, replacing the obsolete and non-reentrant gethostbyname and gethostbyaddr functions.
  12. Concurrent programming.  Enhanced coverage of performance aspects of parallel multicore programs.
The new edition will be available in March 2015.
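As an illustration of the getaddrinfo/getnameinfo style mentioned in item 11 (a sketch of ours, not code from the book), the following resolves a host name without caring whether the answers are IPv4 or IPv6, then converts each result back to a numeric string:

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int main(void) {
    struct addrinfo hints, *list, *p;
    char host[256];

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo("localhost", NULL, &hints, &list);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    for (p = list; p; p = p->ai_next) {
        /* Convert each socket address back to a printable numeric form */
        if (getnameinfo(p->ai_addr, p->ai_addrlen, host, sizeof host,
                        NULL, 0, NI_NUMERICHOST) == 0)
            printf("%s\n", host);
    }
    freeaddrinfo(list);
    return 0;
}
```

Unlike gethostbyname, these functions use no static buffers, so they are safe to call from multiple threads.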

Wednesday, March 27, 2013

Updated the CS:APP Bomb Lab

We've just released an update to the Bomb Lab on the CS:APP site. It fixes a bug caused by the fact that on some systems, hostname() doesn't return a fully qualified domain name.

Tuesday, January 22, 2013

The CS:APP Cache Lab

We've released a new lab, called the Cache Lab, that we've been using at CMU in place of the Performance Lab for a few semesters. In this lab, students write their own general-purpose cache simulator, and then optimize a small matrix transpose kernel to minimize the number of cache misses. We've found that it really helps students to understand how cache memories work, and to clarify key ideas such as spatial and temporal locality.

Monday, November 12, 2012

Peking University Report

I just returned from a trip to Peking University (PKU).  They have recently adopted CS:APP as the textbook for their course "Introduction to Computer Systems" (ICS), patterned after the course we teach at CMU (the course for which CS:APP was originally written).

They now require ICS for all CS majors.  Moreover, as part of an initiative by the president of the university, they are teaching it in a form where they have the usual lectures, but they also hold weekly recitation sections taught by faculty members.  It is one of six courses being taught across the entire university in this format this term.  Here are some statistics for this term:

  • 167 students
  • 14 recitation sections (12 students each)
  • 14 faculty doing recitations
  • 8 faculty doing lectures
That's a lot of resources to devote to a single course!

Monday, June 11, 2012

Chinese Translations of CS:APP

In a recent blog post, I noted that 52% of all copies of CS:APP sold were in Chinese.  Prof. Yili Gong of Wuhan University did the translations for both the first and second editions of the book.  Prof. Gong has also been a valuable contributor to our errata.

I recently came back from a trip to China, where I gave lectures about CS:APP at both Peking University and Tsinghua University, both of which use the book in their courses.  Looking at our adoptions list, there are only 8 universities in China that we know of using CS:APP as a textbook.  Apparently, the vast majority of copies sold in China are being used by individuals for self study.

Wednesday, May 30, 2012

Who Reads CS:APP?

I gathered some data on the total sales of the various versions of CS:APP.  It's now in its second edition, and it has appeared in multiple languages:
  • English.  Including versions published in India (1st edition only) and China (1st and 2nd edition) for readers in those two countries
  • Chinese (1st and 2nd edition)
  • Korean (2nd edition)
  • Russian (1st edition)
  • Macedonian (1st edition)
All told, as of Dec. 31, 2011, a total of 116,574 books have been sold, across all editions, versions, and formats (paperback, hardcover, e-book).  The following pie chart shows how this divides across the language categories (sorry, no statistics on Macedonian, but I imagine the numbers are fairly small):

One thing that's clear is that we're very popular in China: fully 52% of the total has been in Chinese, and another 15% has been the English version for the Chinese market.

Thursday, May 17, 2012

Update to the Bomb Lab

We've updated the Bomb Lab sources on the CS:APP site to address a problem that arises when students from previous semesters run their old bombs while the current instance of the lab is underway.

The Bomb Lab servers assign defusions and explosions to Bomb IDs, rather than users, and Bomb IDs start over from scratch each term.  Thus, if a student who took the class last semester ran their old bomb while the lab was underway this semester, the explosions and defusions from the old bomb would be incorrectly assigned to the current bomb with the same Bomb ID.

To address this, we've added a per-semester identifier, called $LABID, to the Bomb Lab config file.  Instructors can set this variable each term (for example, $LABID="f12") to uniquely identify each offering.  Any results from previous bombs with different $LABIDs are ignored.

Thanks to Prof. Godmar Back, of Virginia Tech, for pointing this out.

Monday, April 23, 2012

Update to Buffer Lab

We've uploaded another update to the Buffer Lab to fix a couple of issues.

(1) Some recent gcc builds automatically enable the -fstack-protector option. We now explicitly disable this by compiling the buffer bomb with -fno-stack-protector.

(2) In order to avoid infinite loops during autograding, the previous update from February 2012 introduced a timeout feature that was always enabled. However, this was a problem for students who were debugging their bombs in gdb. We now enable timeouts only during autograding.

Thanks to Prof. James Riely, of DePaul University, for pointing these out to us.

Tuesday, February 21, 2012

Updated Buffer Lab

We've updated the Buffer Lab on the CS:APP site to be more portable, more robust to infinite loops in student exploits, and more random during the nitro phase. Thanks to Prof. Godmar Back from Virginia Tech for identifying the issues and helping us with the solution.

Saturday, February 4, 2012

CS:APP Visits Africa

I am on a visit to Nairobi, Kenya as part of a project to create a test that we hope will be used worldwide to determine whether someone is qualified to be an entry-level programmer. You can read more about that from our CMU Press Release.

Shortly after arriving, we visited Strathmore University, where I gave a presentation about CS:APP.  There were students and faculty members from several area universities.  The talk went very well, with interesting and insightful questions from the audience.  Perhaps the most striking response occurred when I showed our map of schools using CS:APP as a textbook as of Jan. 1, 2012:

What stood out on this map, especially for Kenyans, was that we don't have a single adoption on the African continent!  There was a lot of discussion about why that could be.  I wish I knew!  Better yet, I hope that schools in Africa will see the value in teaching about computer systems from a programmer's perspective.

Sunday, January 22, 2012

Map of Schools Using CS:APP

Using tools from Google for geocoding and for generating maps, we've created a map displaying all of the schools we know of that are using CS:APP as a textbook.  Check it out.

Tuesday, October 18, 2011

Dennis Ritchie's Legacy

Dennis Ritchie passed away on October 12, 2011 at the age of 70. I never met the man, but we have all been profoundly affected by his work, particularly in the development of C.

It's hard to imagine a world without C, but prior to its development in the early 1970s, essentially all system-level programming was done in assembly code.  That had two undesirable features: 1) there was no portability from one machine to another, and 2) writing assembly code is tedious and error-prone.  Although higher-level languages, such as Cobol, Fortran, and Algol, were well established at the time, they sought to hide away many of the low-level details that a system-level programmer must use.  Here are some examples of features that were not available in these languages:
  • Bitwise logical operators
  • Integer variables of different sizes
  • Byte-level memory access
  • The ability to selectively bypass the type system via unions and casts

C had predecessors: The B language developed by Ritchie's colleague Ken Thompson, which in turn arose from BCPL, developed by Martin Richards of Cambridge University in 1966. These predecessors had many limitations, however, and so C really is the first, and arguably still the best, language for implementing system software.

C was, to my knowledge, the first high-level language that embraced the byte (although referred to as char) as a first-class data element---something that could be used as a data value, read from or written to memory, and replicated as an array. Having the ability to operate on byte arrays enables C programmers to create arbitrary system-level data structures, such as page tables, I/O buffers, and network packets. Such a capability was previously only available to assembly language programmers.

The early C compilers did not produce very good code. To get C programs to run fast, programmers had to do a lot of low-level tweaking of the code, such as declaring local variables to be of type register, converting array references to pointers, and explicitly performing such optimizations as code motion, common subexpression elimination, and strength reduction. This had the property of making the programs nearly illegible. Serious code tuners often resorted to assembly language. Fortunately, modern compilers make most of this low-level hacking unnecessary.

Consider how much code is still being written in C: all of Linux, all GNU software, and much more. If we extend to its younger brother C++, we get the vast majority of computer systems worldwide.

Dennis, thanks for the great language you created.

Friday, September 16, 2011

CS:APP Now Available in Macedonian

Prentice-Hall just sent us several copies of "Компјутерски системи" by Рендал И. Брајант (me) and Дејвид Р. О’Халарон (Dave). According to Wikipedia, there are around 2-3 million native speakers of Macedonian worldwide. It's good to know they'll have access to the material of CS:APP.

The publisher has a translation of our book, as well as books on wine, smoking cessation, and the chemistry of explosives.

Wednesday, September 14, 2011

Labs for self-study students

Over the years, we've seen an increasing number of students using CS:APP for self-study, outside of any organized classes. In response to the demand from these students, we've decided to make the lab assignments (without solutions) available to the public from the CS:APP labs page. Of course, solutions will remain available only to instructors with CS:APP accounts.

Dave O'Hallaron

Friday, August 5, 2011

PEDAGOGY: How to Design an x86-like Processor

In switching from RISC to x86 with our Introduction to Computer Systems course, we had an advantage over traditional computer architecture courses in that we didn’t have to worry about how to actually implement a processor.  We could skip all the complexities of instruction formats and instruction encoding, which is definitely not pretty with x86.  When it came time to write our chapter on processor design (Chapter 4) of Computer Systems: A Programmer's Perspective, we had to confront this unpleasant aspect of x86.  Here we did a bit of a sleight of hand, creating a new instruction set we called “Y86” (get it?).  The idea was to have a notation that looked like x86, but was simple to decode and could be executed by an in-order pipelined processor.  Thus, we limited the addressing modes and made it so that arithmetic instructions could only operate on registers, but we retained the stack-oriented procedure execution conventions.  We also used a very simple encoding.  This made it feasible to go through the entire implementation of two processors---one executing a complete instruction every clock cycle, and one based on a 5-stage pipeline---in a single chapter.  We could even generate a complete Verilog implementation of the pipelined processor and map it through synthesis tools onto an FPGA.  I’ll admit that this approach is a compromise from our goal of presenting real machines executing real programs, but it seems to have worked fairly well.

Interestingly, in their textbook Digitaltechnik—Eine praxisnahe Einführung (Digital Technology, a Practical Introduction), Armin Biere and Daniel Kroening present the design of a processor that executes a subset of the Y86 instructions using the actual IA32 encodings of the instructions.

Apparently, we weren’t the only ones to think of the name “Y86” as a variant of “x86.”  Randall Hyde introduces a stripped-down 16-bit instruction set, which he names “Y86,” in his book Write Great Code, published in 2004, several years after the first edition of CS:APP came out.

The domain names “” and “” are already taken, but it looks like they’ve been occupied by a cybersquatter named Richard Strickland since 2003.  Perhaps he’s just waiting for us to buy him out!

Randy Bryant

PEDAGOGY: Why Teach x86 rather than RISC?

I am a lapsed member of the Church of RISC.  When I first started teaching computer architecture courses in the late 1970s (at MIT & Caltech), the prevailing wisdom was that machines should “close the semantic gap,” meaning to have instruction sets that closely matched the constructs used in high-level programming languages.  We would wax poetic about machines such as the Burroughs B6700, which directly executed Algol programs, and the Digital Equipment Corporation VAX, which had special instructions for such operations as array indexing (with bounds checking) and polynomial evaluation.  Surprisingly, there was little discussion in those days about maximizing performance; there was not even an accepted standard for measuring performance, such as the SPEC benchmarks.

In 1982, I heard a presentation about the principles of Reduced Instruction Set Computers (RISC) from David Patterson.  It was a real eye-opener!  He pointed out that all that business about closing the semantic gap was implemented using microcode, which simply added an unnecessary layer of interpretation between the software and the hardware.  Advanced compilers could do a much better job of taking advantage of special cases than could a general-purpose microcode interpreter.  I returned from that talk and told the students in my computer architecture course that I felt like I’d been teaching the wrong material.

When I started teaching computer architecture at CMU, to both our PhD students and to CS undergraduates, I fully embraced the RISC philosophy, partly because of the then-new textbook, Computer Architecture: A Quantitative Approach, by Patterson & Hennessy.  We made use of a set of MIPS-based machines available to the students, and later a set of Alphas, provided by Digital Equipment Corporation.  Being able to compile and execute C programs on actual machines proved to be an important aspect of the courses.  Needless to say, I was a true believer in RISC, and I would scoff at x86 as a big bag of bumps and warts with all of its addressing models, weirdly named registers, and truly icky floating-point architecture.

As mentioned in our earlier post, the initial offering of 15-213, our Introduction to Computer Systems course, made use of our collection of Alpha machines.  But, we could see that we were on a dead-end path with these machines.  In spite of their clean design and initial high performance, Alphas did not fare well in the marketplace.  The steady progress by Intel, first with the Pentium and then with the PentiumPro, slowly took over the market for desktop machines, including for high-end engineering workstations.  We also were thinking at that point about writing a textbook, to encourage others to teach about computer systems from a programmer’s perspective, and so we wanted to find a platform that would be widely available.  We considered both Sun SPARC and IBM/Motorola PowerPC, but these had their own funky features (register windows, branch counters), and also lacked the universality of x86.

As an experiment, I tried compiling some of the C programs we had used to demonstrate the constructs found in machine-level programs on a Linux Pentium machine.  Much to my surprise, I discovered that the assembly code generated by GCC wasn’t so bad after all.  All of those different addressing models didn’t really matter---Linux uses “flat mode” addressing, which avoids all the weird segmentation stuff.  The oddball instructions for decimal arithmetic and string comparison didn’t show up in normal programs.  Floating-point code was pretty ugly, but we could simply avoid that.  Moreover, it didn’t take a particularly magical crystal ball to see that x86 was going to be the dominant instruction set for the foreseeable future.

So, for the second offering of 15-213 in Fall 1999, Dave O’Hallaron and I decided to make a break from the RISC philosophy and go with x86 (or more properly IA32 for “Intel Architecture 32 bits”).  This turned out to be one of the best decisions we ever made.  Students could use any of the Linux-based workstations being deployed on campus.  By installing the Cygwin tools, they could even do much of the work for class on their Windows laptops.  The feeling of working with real code running on real machines was very compelling.  When we then went to write Computer Systems: A Programmer’s Perspective, we were certain that x86 was the way to go.  Now that the Apple Macintosh has transitioned to Intel processors, there are really three viable platforms for presenting x86.

One thing we learned is that every machine has awkward features that students must learn if they are going to look at real machine programs.  Here are some examples:
  • In MIPS, you cannot load a 32-bit constant value with a single instruction.  Instead, you load two 16-bit constants, first using the lui instruction to load the upper 16 bits, and then an addi instruction to add a constant to the lower 16 bits.  With a byte-oriented instruction set, such as x86, constants of any length can be encoded within a single instruction.
  • C code compiled to MIPS uses the addu (unsigned add) instruction for adding signed numbers, since the add (signed add) instruction will trap on overflow.
  • With the earlier Alphas, there was no instruction to load or store a single byte.  Loading required a truly baroque pair of instructions: ldq_u (load quadword unaligned) and extbl (extract byte low), followed by two shifts to do a sign extension.  This is all done in x86 with a movb (move byte) instruction, followed by a movsbl (move signed byte to long) to do the sign extension.

My point here isn’t that x86 is superior to a RISC instruction set, but rather that all machines have their bumps and warts, and that’s part of what students need to learn.

In the end, I think the choice of teaching x86 vs. a cleaner language to a computer scientist or computer engineer is a bit like teaching English, rather than Spanish, to someone from China.  Spanish is a much cleaner language, with predictable rules for how to pronounce words, far fewer irregularities, and a smaller vocabulary.  It’s even useful for communicating with many other people, just as learning MIPS (or better yet, ARM) would be for programming embedded processors.  But, as English is the main language for commerce and culture in this world, so x86 is the main language for machines that our students are likely to actually program.  Like Chinese parents who send their children to English-language school, I’m content teaching my students x86!

Randy Bryant