Wednesday, 18 June 2008

One task, One place

Today's advice is nicely summed up by the saying "A place for everything, and everything in its place." Like any of my advice, you can choose to ignore it, or even contradict it. If anyone has advice I like better, I'll follow their advice instead.

The DRY principle (Don't Repeat Yourself) says that any complex task should be carried out in only one place.

Why? The end user doesn't care if you repeat yourself. The computer certainly doesn't care if you repeat yourself. In the end, it turns out that the people who care whether you repeat yourself are the developers of your code (including yourself). Again, it boils down to programming for people.

Duplication is a Bad Thing(TM) for several reasons. The most obvious reason is that if you have some duplicate code, and discover a bug in that code, you could fix it in one of the places and completely forget about the other places. So in fact, the bug isn't fixed. Now if the duplicate code had been written as one function that was called from the places where that code was needed, instead of duplicated, the bug fix would only need to go in one place.

(I'm going to ignore the advice and repeat myself here because this is important!) The bug fix would only need to go in one place! When you've come up with a fix, there's no need to trawl through the codebase looking for all the places that the fix needs applying because the places that need this code all pass through the single place that you have applied the fix.
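As a minimal sketch of that idea (the names and the email clean-up rule are invented for illustration, not taken from any real codebase), here both callers pass through one function, so a fix to the clean-up rule lands everywhere at once:

```python
# Hypothetical example: normalize_email, register and is_registered
# are made-up names used only to illustrate the point.

def normalize_email(raw: str) -> str:
    """The single home for the clean-up rule.

    If this turns out to be wrong (say, we forgot to strip spaces),
    the fix goes here and nowhere else.
    """
    return raw.strip().lower()


def register(users: set[str], raw_email: str) -> None:
    # Stores the cleaned-up address rather than repeating the rule inline.
    users.add(normalize_email(raw_email))


def is_registered(users: set[str], raw_email: str) -> bool:
    # Looks up using the same rule, so the two callers cannot drift apart.
    return normalize_email(raw_email) in users
```

Had register() and is_registered() each carried their own copy of the strip-and-lowercase logic, fixing one copy and forgetting the other would leave users who can sign up but never be found.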

The second reason to avoid duplication is that duplication makes code more difficult to read and comprehend. Which is easier to understand - 10 lines of code, or 20 lines of code? For the sake of argument, let's assume that each of the individual lines is as understandable (or obtuse!) as every other line.

Fewer lines of code means less to understand, which means less to misunderstand! Sure, if someone takes the time to pick apart the code line by line, they might eventually understand it. But a first glance isn't going to give us as quick an overview if there is twice as much code there as is really necessary. Also, if you wrap the duplicate code up into a single function, even the function name can help comprehension. Compare Sample1:


function bigFunc() {
    baz(bob.foobar(X, Y, Z));
    abc_def(fred, bob, saz.ABC);
    baz(jane.foobar(X, Y, Z));
    abc_def(sam, jane, saz.ABC);
}

vs Sample2:

function bigFunc() {
    arrangeMeeting(fred, bob);
    arrangeMeeting(sam, jane);
}

function arrangeMeeting(Person p1, Person p2) {
    baz(p2.foobar(X, Y, Z));
    abc_def(p1, p2, saz.ABC);
}

Now the arrangeMeeting() function may still not make much sense, but at least we now know that this chunk of code is meant to arrange a meeting between people. If we are interested in the details of what bigFunc() does, we only have to read through the horrible mess of foobar(), baz() and abc_def() once (if at all). And of course, any mistakes in the meeting code will only need to be corrected once, in the arrangeMeeting() function.

The two samples lead quite nicely to my second point today, which is:

In any one place, only one task should be carried out.

Instead of having to digest the entirety of bigFunc() in Sample1 (and possibly getting indigestion), we find the bigFunc() in Sample2 much easier to digest. The code that deals with arranging meetings has been moved off into its own function, and we are left with much smaller, bite-size chunks of code to deal with.

If you find that you're writing one comment about several lines of code, several times in a function, then each of those chunks of code could well be ripe for turning into a function of its own.
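Here is one way that might look after the split, as a rough sketch. The report task and every name in it are hypothetical; the docstrings record the comment that each helper could have replaced inside a longer build_report():

```python
# Hypothetical example: a small report builder, invented for illustration.
# Each helper stands in for a commented chunk of a once-longer function.

def parse_scores(lines: list[str]) -> dict[str, int]:
    """Was: '# parse the input lines'."""
    scores = {}
    for line in lines:
        name, value = line.split(",")
        scores[name.strip()] = int(value)
    return scores


def top_scorer(scores: dict[str, int]) -> str:
    """Was: '# find the highest score'."""
    return max(scores, key=scores.get)


def format_summary(scores: dict[str, int], winner: str) -> str:
    """Was: '# format the summary'."""
    return f"{len(scores)} players, winner: {winner} ({scores[winner]})"


def build_report(lines: list[str]) -> str:
    # The top-level function now reads like the list of comments it replaced.
    scores = parse_scores(lines)
    winner = top_scorer(scores)
    return format_summary(scores, winner)
```

The function names now do the job the comments used to do, and each chunk can be read, tested, or fixed without wading through the others.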

A program should be like a well-structured document, with an overall view, chapters (modules), headings (classes), and subheadings (functions), and it should be easy for a reader to drill down to the place that they're interested in. You can go a long way towards achieving this by having a place for everything, and keeping everything in its place.
