Saturday, 6 April 2013

Designing Dictionaries

Just about every school of Computing Science in the known universe has a course in first or second year called something like "data structures" that exposes students to several (sometimes many) techniques for organizing data in ways more complex than a simple sequence. There's often a fat textbook with dozens of them, of which an instructor selects the two or three most important and a smattering of others.

The things is, most of those structures are special implementations of the abstract idea of a "dictionary": given one of these, find one of those. Given a name, find a telephone number. Given a street address, find the GPS coordinates. Most of the ways of doing this have already been programmed hundreds of times, so some students wonder "why should I bother learning all this, when I can just reuse what somebody else has done?"  They especially ask this if you're showing them a particularly tricky piece of code from one of the cleverer dictionaries.

So how should we teach this material? Back in the late 1980's, I thought about that question a lot, and came up with an answer I liked: Software design is about making deliberate choices among alternatives, and this is an ideal course for driving that home.

I'm going to save the most first point, the one that motivates the material, for last.

The second step is to point out that all these data structures vary in how much computer memory they occupy, in how much computer time they cost when you use them, and that both of these vary depending on details of what specific sort of data you're storing. Plus where you're storing it -- what goes on a computer disk has different characteristics from what's in the computer's main memory. What's  trickier is that there are several things ("operations") one does with the data, and some structures make one thing fast at the expense of making another slow, or making something fast at the expense of occupying more memory.

The third thing do do is to teach how to figure out how much time and memory each of those structures costs -- and that's a critical lesson we want the students to apply throughout their career, when they're tackling problems we never taught them about. Some instructors stop there, which from one perspective is a perfectly reasonable place to stop.

The first step is to point out that you're going to have to learn to choose which dictionary is best for a specific problem (actually, the first thing is that they're dictionaries, which I've seen left out occasionally). A common mistake (one that I made, too, the first time around) is to treat each data structure as its own thing, and fail to point out that they share what we call "a common interface" -- a common set of specific tasks they can carry out. The common interface means that any of the data structures will "work", and most students start out thinking there's only one answer to any given problem. So they resist learning alternatives in the first place.

One of the lessons of modern pedagogy is that students learn what you are going to test them on instead of what you try to teach them, so you have to ask over and over again about making choices. The best way to do this is with "word problems" -- here's a real-world scenario; which data structure do we use to solve it? How often are each of those operations going to get used? How constrained are we for space? Show your work (that is, show the analysis where you compared more than one alternative).

Teaching this way has a social cost.  Many students hate word problems, and hate to "show their work". A colleague of mine came up with the same idea, and at the end of the course assigned exactly this sort of problem. The students hated that question. But it was the right question to ask.