Computer Floss » General software engineering
http://computerfloss.com — Delightful digital distractions in free/libre/open source software

Software development and your natural rhythms
Mon, 11 Nov 2013

One of the lessons from Joel Spolsky I took to heart right from the start of my career as a software developer was to schedule tasks honestly. Scheduling dishonestly means putting a time estimate on a task without really knowing what work is involved, so the estimate is bound to be wrong.

To combat this, you should schedule honestly, and Joel proposes a simple rule: tasks should be measured in hours, not days, with no realistic estimate larger than 16 hours. If your estimate is larger than that, you probably don’t appreciate the true size of the task. Experience tells us that programmers usually underestimate tasks they don’t fully understand, so estimating in terms of days or weeks risks schedule overrun. Tasks that will take more than a day or two should therefore be broken down into smaller sub-tasks, because this forces you to think about the work in more detail and gain a better understanding of it.

But how much time is in a day?

So an 8-hour task will take about a day, yes? Well, maybe not. One of the other things I recall Joel saying is that there is less time in the day than you think. Yes, we ‘lose’ time to lunch breaks and meetings, but that’s easily accounted for in a schedule.

What about that other ‘lost time’? You know what I’m talking about: all that web-surfing, checking email, chatting at the water-cooler. How much time every day is ‘lost’ to these? And how guilty do they make us feel when work ends, knowing we spent a cumulative hour on them instead of completing that task which should have taken only a day?

In actuality, we probably shouldn’t be feeling guilty about this at all. We should just be scheduling better.

Scheduling in cycles

I recently read this article: “The origin of the 8 hour work day and why we should rethink it”. It presents interesting arguments for switching away from a traditional 8-hour work day and has several tips for working more effectively. The one that really intrigued me was the argument that human concentration follows a natural rhythm whereby we can focus on a task for a certain amount of time before we start to lose it and need a break. The article claims that most of us can hold focus for about 90 minutes before requiring a break of around 20 minutes, and so advocates planning work time in 90-minute windows (let’s call them cycles).

(Figure: the ultradian rhythm of life)

So what happens when we combine this with Joel’s advice? A task estimated at 6 hours of effort (360 minutes) would require 4 cycles (4 x 90 minutes). Factor in the three 20-minute breaks between cycles and that’s an extra hour.

If you can manage more than four cycles in a day then good for you, but four is enough for me. Under this cycle-scheduling system, then, “one day” (if we stick to an 8-hour day) gives you six hours of work time, a one-hour mealtime, and an hour of short breaks, recognising that we all give in to the temptation to step away now and again.
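The arithmetic above is easy to sanity-check with a short function. This is just a sketch in Python: the 90- and 20-minute figures come from the article, while the function and constant names are my own.

```python
import math

CYCLE_MIN = 90   # one focused work cycle
BREAK_MIN = 20   # rest between cycles

def schedule_minutes(effort_hours):
    """Total wall-clock minutes for a task, counting breaks between cycles."""
    cycles = math.ceil(effort_hours * 60 / CYCLE_MIN)
    breaks = max(cycles - 1, 0)
    return cycles * CYCLE_MIN + breaks * BREAK_MIN

# A 6-hour task: 4 cycles plus 3 breaks = 420 minutes, i.e. 7 hours.
print(schedule_minutes(6))
```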

Can it work for programmers?

Let’s consider a few things.

  • It factors in those more unpredictable distractions (Facebook, email, coffee etc.), which are really symptoms of losing focus. Using cycles allows you to allocate time for them, which recognises that we need them and stops us feeling guilty about ‘wasting’ time.
  • It can be encouraging to know that your current cycle of work will soon be followed by a scheduled break, especially when the work is intense.
  • You can more easily track the work you did. At the end of a day, can I really be sure I put in X hours of work? How much cumulative time was lost to those little breaks? If you follow the cycles, you have more confidence in how much time you really spent on tasks.
  • Granularity. Most of the time, programmers can’t measure progress with the granularity of certain other professions (like a bricklayer who can measure bricks laid per hour). But there are still activities we can plan which fit into a 90-minute cycle. Think along the lines of: lay out one basic screen design, implement that setter function, write the code that makes TestX pass. And speaking of testing…
  • Cycles can also help to make the distinction between coding and testing clearer. When you estimate a task as 8 hours, it’s easy and tempting to lump the two together in a schedule, only to find that coding takes up all the allocated time and testing gets pushed back or even dropped. Using 90-minute cycles encourages the approach of spending one cycle getting something working and then spending the following cycle reviewing and testing what you’ve done. Most programmers can relate to how different code looks when it’s examined again after a break, and how this makes the flaws easier to find.
  • Interrupting the flow. One objection I’ve heard goes along the lines of: “It can take programmers several hours just to get into a task, so you shouldn’t interrupt the flow.” This is true, but getting into a task still needs factoring into a schedule somehow. Just as cycles force you to think about the actual small details of a task, they also force you to plan your research or preparation.
    • Try reducing the breaks in-between preparation cycles to 5 minutes.
    • Be more detailed about what “getting into a task” means: How many pages of a resource can you read in one cycle? How much of a prototype/test code can you write in one cycle? etc.

Summary

So, instead of estimating a task or feature in terms of the number of hours it will take, why not estimate in terms of the number of cycles instead? Personally, I like the approach, and I already work in a similar pattern anyway. If it really is effective and natural to all of us, then why not make scheduling take it into account explicitly and give yourself a more accurate project plan?

Edit: One reader seems to have the impression that I’m saying non-peak time is exclusively time to slack off, even though I never stated such a thing and even included work tasks in the examples of non-peak time. Let me make it clear: non-peak time includes unscheduled stuff that doesn’t contribute to the development project you’re working on and billing for. It could be slacking-off time, but it could also be reading your email, checking in with work colleagues, looking at or updating the project plan, and so on… anything that takes your focus away from the scheduled development work.

How being open with your data keeps it safe
Thu, 22 Aug 2013

In a previous post, I gave some examples of FLOSS programs that performed essential computing tasks (namely operating system, email and cloud storage) which you could use instead of proprietary alternatives. I claimed that the FLOSS versions had the potential to improve the security of your data (in case you’re concerned about unauthorised third parties accessing it) without going into detail about how.

This post will explain why entrusting your data to open source software can protect it from snoopers. Please be aware that there are many ways to spy electronically; I’ll only be addressing some that the use of FLOSS can guard against. Also be aware that computer security is not my speciality; I’m just trying to present the issues as best I understand them, from several perspectives.

The “Many Eyeballs” Perspective

When I discussed operating systems in my previous post on this subject, I pointed out how it’s hard to know exactly what a closed source program does. This has a lot to do with how software is built, which I’ve explained before: programmers write software in human-readable source code and then run that code through a compiler to produce the binary program that’s readable only by a computer. While closed source (a.k.a. proprietary) software is released in binary form only, software released under a FLOSS licence comes with the source code as well, a bit like getting a Haynes manual (those trusty books that detail every single component of an automobile) supplied with your new car. With a proprietary program, by contrast, you get the equivalent of no manual and a car with a sealed bonnet.

Without the source code, you’re forced to treat the program as a black box: all you can see is its external behaviour. This can make it extremely difficult to determine its internal workings. You may wonder whether there’s any code in the program that makes it “phone home”, reporting information about you to a remote server. You may ask yourself if there are vulnerabilities in the program which would allow someone to log into your system without your knowledge. For questions like these, there’s no easy way to find answers without the source code.

With the source code, you have the opportunity to search for such vulnerabilities. When many people are able to look through the code, all those eyeballs act as a strong defence against the insertion of secret spying routines and hard-coded back doors. This is not just theoretical speculation. In the 1990s, for example, Borland released a database server called InterBase that had a back door intentionally engineered into it (see David Wheeler’s article mentioning this), which allowed people in the know to break in. As long as the program remained proprietary, this vulnerability stayed secret for at least six years. After InterBase was eventually open sourced, the back door was discovered within months.

As well as revealing back doors, the availability of source code helps you verify what the program sends out. Whenever you want to transmit data securely from your computer, you need to encrypt it using an encryption algorithm, which turns your readable data into “meaningless” cipher text. There are many such algorithms available, some strong, some weak. With access to the source code, you can verify for yourself whether the program uses poor-quality encryption routines that would make it a simple matter for third parties to decode your intercepted messages. It’s partly for this reason that famed IT security expert Bruce Schneier recommends that security-conscious engineers use FLOSS.

Keep in mind that the benefits of access to the source code are not automatic. Just because it’s possible to review the code for vulnerabilities, that doesn’t necessarily mean it gets done. The community that develops the software needs to be sufficiently large, active and knowledgeable about IT security.

The “Keys and Padlocks” Perspective

What I termed the “Many Eyeballs” Perspective is worth knowing but, from what we know about the current spying scandal, it isn’t actually the most relevant one. The stories that have dominated the front pages for most of this summer are not about secret source code, but rather about who holds the keys to your information.

One of the most important concepts in contemporary computer security is public key encryption. It’s a bit complicated, but it’s basically a system that simulates keys and padlocks in software. Each user has a public key and a private key, which are both actually just very long numbers in a file, mathematically linked with each other. The public key can be passed around freely to anyone you like, whereas the private key must be kept secret. Although it’s called a public key, by analogy it’s more like a padlock. I’ll explain…

The classic application of this idea is message encryption. If your friend wants to send a message intended only for you, they have to encrypt it so anyone who might intercept it is unable to read it. Encrypting a message is easy: your friend just uses your freely available public key to turn the plain text into cipher text. However, decrypting the message can only be done with the corresponding private key (which is why you must keep that one secret). By analogy, it’s as though your friend wants to send you a box and be certain that only you can open it. You send your open padlock (public key) to your friend, who then uses it to seal the box. Only you have the (private) key for this padlock, so your friend can send the box knowing that no-one other than you can unlock it if it’s intercepted. If you want to keep your emails safe from prying eyes, encrypting them before dispatch with a tool such as PGP is always a good idea. But there’s another way to apply this idea besides encryption, and that’s authentication.
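To make the mathematical link between the two keys concrete, here is a toy RSA round trip in Python. The numbers are tiny textbook values chosen purely for illustration; real keys are thousands of bits long and use padding schemes, so this is absolutely not secure code.

```python
# Toy RSA with textbook-sized numbers (illustration only, not secure).
p, q = 61, 53                # two secret primes
n = p * q                    # 3233: part of both keys
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent  -> public key is (e, n)
d = pow(e, -1, phi)          # private exponent -> private key is (d, n)

message = 65                 # a message, encoded as a number < n
cipher = pow(message, e, n)  # anyone can encrypt with the public key...
plain = pow(cipher, d, n)    # ...but only the private key decrypts it

print(cipher, plain)         # plain comes back as 65
```

Note the asymmetry: encryption uses only the freely shared pair (e, n), while decryption requires d, which never leaves your possession.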

Authentication deals with who is allowed access to a computer system. The most common way of logging into a computer (i.e. being authenticated) is with a username and password. However, passwords are sometimes unreliable, particularly when the password owner is sloppy. Choosing the name of your pet cat as a password is pretty easy for a nefarious hacker to guess, and a brute force attack can try every word in the dictionary in a very short time. (Hence the reason your system administrator forces you to use passwords with a minimum of 32 characters, utilising letters, numbers, mathematical symbols and Egyptian hieroglyphs.) An alternative is to use public and private keys. Instead of being asked for a password, you give your public key to the server administrator, who puts it on the server, while your private key remains on your own computer. Thereafter, when you attempt to log in to that server from your computer, the server checks whether your private key matches the public key it holds. If so, you are granted access to the system — and all without a single bloody password!
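That "checking whether the keys match" step can be sketched as a challenge-response handshake: the server sends a random number, your machine signs it with the private key, and the server verifies the signature with your public key. The Python below uses a toy RSA key pair with textbook-sized numbers purely for illustration; real systems such as SSH use full-sized keys and hashing.

```python
import secrets

# A toy RSA key pair (illustration only, not secure).
n, e, d = 3233, 17, 2753     # (e, n) is public; d stays on your machine

def sign(challenge, d, n):
    """Client side: prove possession of the private key."""
    return pow(challenge, d, n)

def verify(challenge, signature, e, n):
    """Server side: check the signature using only the public key."""
    return pow(signature, e, n) == challenge

challenge = secrets.randbelow(n)          # server picks a random number
signature = sign(challenge, d, n)         # client signs it
print(verify(challenge, signature, e, n)) # True: access granted
```

The server never needs to see the private key, which is exactly why handing out your public key is harmless.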

Your public key might be the only one on the server or it could be one of many, it makes no difference. When your public key is added to a server, it’s as though a new door into the system is built and your padlock is used to lock that door. This gives you your very own exclusive entry point. There could be dozens of other doors, but you can only open your own; anyone without a door has no way to get into the system.

Now, here’s the rub. The person in charge of the system gets to set up the doors. If you are the system administrator, then you know who has access to your system. However, if responsibility for the system is delegated to an external party (as is the case with most popular web services we all use) it can be really hard to know for sure who has access. They own and maintain the servers, not you. Of course, there’s a secured door that only you can pass through, but how do you know whether or not other secret entry points exist, the so-called “back doors” that let someone else access your data?

This is pretty much the situation with many web-based services like GMail and Dropbox. They stand accused of setting up back doors on their services which allow intelligence agencies to step in and poke around at will. Sadly, I’m afraid there’s not much you can do about this. They certainly won’t provide you with a copy of their software which you can use to set up your own email servers or cloud-storage services.

But you do have a choice with FLOSS programs, which is why (in the previous post) I recommended Kolab and ownCloud as open source alternatives to their proprietary counterparts. By deploying these programs on your own server, you get to exercise control yourself and prevent the installation of secret back doors.

Bonus Perspective: “The Wiretap”

(This last perspective follows on from the previous one, although it is not exclusively related to FLOSS.)

Another aspect of the spying scandal is the alleged listening in on communications as they whizz through the Internet, essentially wiretapping the world wide web. If you’re concerned about the security of your messages, you should encrypt your outgoing traffic with tools like PGP. With FLOSS programs this is usually quite simple to set up and, as a bonus, you get the chance to control the encryption keys.

However, proprietary programs don’t always make it easy to encrypt traffic, or they might deny you the option altogether. (Just try and encrypt your GMail traffic and see how difficult things get.) What’s more, where traffic is encrypted, the service providers may be in control of the encryption keys and we’re back to the problem in the “Keys and Padlocks” perspective: how far do you trust them not to reveal the keys?

Conclusion

As I’ve shown over these last two posts, FLOSS-based programs have the potential to improve the security of your computer-based data and keep prying eyes away. An operating system like Linux, email services with Kolab or cloud-sharing via ownCloud are all safe bets.

But I must stress the benefits are not automatic. There are prices to pay.

For one thing, more proactivity is required. Of course, software developers need to be well clued-up on writing secure software, but privacy-conscious users also have to be vigilant. They have to take on more responsibility for their data and ask the right questions, like: Is this software FLOSS? Who is responsible for controlling access to my computer and its programs? Is proper encryption being used? And so on.

Also, there may be a price to pay in a more literal sense. A famous comic depicts two pigs chatting with each other, remarking how lucky they are to be living on a farm that feeds them and houses them — all for free. Too good to be true? As Georg Greve points out, when you’re the product being sold, being put under surveillance is part of your payment to the service provider. I will venture to say that when you are instead the paying customer of a service provider, particularly one that uses FLOSS, surveillance is probably a much lesser risk… but still don’t forget to read those terms and conditions.

In short, paying those prices gives you the chance to be an empowered customer rather than a product sold at the market.

Pursuing Code Simplicity – Does Dr. Dobb’s Miss the Point?
Mon, 01 Jul 2013

This article in Dr. Dobb’s claims that an obsession with code simplicity exists among some (i.e. agile) programmers. It refers to the old received hacker wisdom that any fool can write complex code, but it takes real talent to write simple code. The author attacks this point of view, saying that in reality some problems are genuinely complex and must be implemented in complex code.

I happen to agree that inherently complex problems sometimes can’t be reduced to “simple” solutions. I wasn’t aware that any developer seriously thinks differently.

But I also think this analysis misses the point. As I understand it, the old wisdom refers to problems that do have a simple solution, but one that takes time and skill to work out. I think every programmer can hark back to their earliest coding days and recall the typical quality of the programs from their youth. Or maybe they have a colleague who couldn’t give a stuff about code quality, and can see what sort of code that attitude produces. In any case, a solution that’s not sufficiently thought out prior to implementation tends to become a mass of multi-level if-statements, endless switch-case blocks or functions hundreds of lines long.

I see a striking similarity with writers. They also need to produce something that’s simple to digest, ideally so that the reader becomes so absorbed in the story they lose all awareness that they’re actually reading. Rare is the writer who composes a novel by jumping straight into the writing without any planning. Writers who do, as J. R. R. Tolkien famously did, will usually get into a mess. (Unconventionally, Tolkien wrote The Lord of the Rings with lots of ideas but without any detailed story plan, and had to restart multiple times. Luckily for us, he persisted and got his story worked out in the end.)

A writer needs to develop certain skills — like plot development, pacing, characterisation — and use these to compose a “plan of action”. Once that’s done, everything is in place: the characters are worked out, their relationships with one another are established, what happens when is settled, and so on. With a plan in place, the actual writing practically takes care of itself and the writer can focus purely on making the prose pleasurable to read.

Similarly, a programmer has a set of skills that he or she uses to analyse the initial problem, drawing on knowledge and experience to anticipate probable hotspots of complexity and to work out the steps, structures and relationships that mitigate them. If, let’s say, your new e-commerce app has to deal with different sales tax rates, it’s very easy to choose between rates by way of a switch statement, because we learn about switching very early in our programming careers.

switch (country) {
  case UK:
    return 17.5;   // no break needed: return already exits the switch
  case IRELAND:
    return 23.0;
  // etc.
}

But experienced programmers know that some parts of any program are volatile, changing often in response to new requirements. Just imagine that this e-commerce application became widespread and suddenly had to support the tax rates of more and more countries. Or maybe additional types of tax suddenly needed to be taken into account. The code could potentially become horrific.

As an alternative to the switch statement, you could implement taxation with a strategy pattern instead, encapsulating each variant in its own sub-class. This would make the resulting program simpler and less brittle. But the pattern approach means you need to possess knowledge about patterns, and patterns aren’t learned quickly or easily.
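A sketch of what that strategy version might look like, in Python rather than the C-style language of the switch example; the class names and registry are my own illustration:

```python
class TaxStrategy:
    """Base strategy: each country's tax rules live in their own class."""
    def rate(self, order):
        raise NotImplementedError

class UkVat(TaxStrategy):
    def rate(self, order):
        return 17.5

class IrelandVat(TaxStrategy):
    def rate(self, order):
        return 23.0

# Adding a country means adding a class and one registry entry.
# No ever-growing switch statement to edit.
STRATEGIES = {"UK": UkVat(), "IRELAND": IrelandVat()}

def tax_rate(country, order=None):
    return STRATEGIES[country].rate(order)

print(tax_rate("UK"))   # 17.5
```

Each strategy can later grow its own logic (reduced rates, thresholds, extra tax types) without touching the others.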

This example illustrates what I’ve always taken the old wisdom to mean. I’ve seen a fair few inexperienced programmers by now (I was one once), and I’ve even known a few coders who just didn’t give a damn about writing simple code. Their efforts, even when an alternative, simpler solution existed, were usually a nightmarish example of code spaghettification. With knowledge, experience and the right attitude, their solutions would no doubt have been much less complex.

Still, there’s plenty of room for agreement here, specifically the article’s championing of readability. Even some of the best programmers forget that source code is somewhat like a novel – a story from one coder to another. Like a novel, the source code should be a pleasurable read too… but I’m not convinced readability comes before simplicity. It wouldn’t matter how beautiful the writer’s language — if the story were a structurally complex mess, I would be putting it back on the bookshelf pretty quickly.

Teaching Web Applications and Arguing to Let Students Loose
Wed, 27 Mar 2013

As I mentioned recently, I’ve come to the end of my latest teaching course: a semester in Web Applications at one of Berlin’s universities (see my previous post for more details). As well as theory, there was a practical element to the course, where the students divided into teams to produce their own web apps from scratch. I’ve done similar things before, but in those cases I was forced by circumstance to assign the students to work on a specific project. What I want to do here is briefly explain how much more successful a strategy it is (IMHO) to allow students to be largely free to determine the course of their work themselves.

The Course

The Web Applications course was eight sessions delivered bi-weekly over a single semester. Each session consisted of a lecture followed by practical work, where the students either worked on an assignment I had given them or developed their web app. By the second session, the students had pretty much settled into their teams of 2 or 3 members. (That was one thing I did enforce, by the way. Teams of 2-3 students work really well, and I would not recommend going above 3.) The university infrastructure was relatively impressive, providing hosting software, databases, project management tools and so on. Use of this infrastructure was optional.

The Choices

When it came to deciding what application the students should develop, I allowed them to choose both their own topic and which technology they would use to implement it. I figured:

  • When you give someone the freedom to choose their own topic, they’re much more likely to end up caring strongly for it, investing in it and making a good job of it. Assigning them something risks sapping their creativity, because they may have little or no interest in the topic.
  • Nevertheless, I gave about a half dozen example topics. This gave both a feel for the kind of scope I was looking for and also some ideas to any teams who couldn’t come up with their own ideas. Using one of these examples had no effect on the final grade. Students were marked on their ability to understand the course material, put it into practice, write reports and present it to their peers; there were no prizes for originality.
  • Allowing the students to choose their own implementation technologies was also a means to maximise their creativity. Programming languages, frameworks, operating systems… these are all ‘religious’ issues and every programmer has their own preferences. Enforcing the use of the university-maintained PHP/MySQL servers would have been a good way to create quite a few frustrated students who felt stifled and uninspired by the choice of tools. Of course, I made sure to verify all technology choices, in case a team wanted to use a completely unfamiliar or inappropriate technology.
  • Finally, it was important for students to experience first-hand the consequences of their own choices. There are risks when embarking on a software development project:
    • Have we bitten off more than we can chew by choosing this topic?
    • Are we making a mistake choosing a cool but unfamiliar language?
    • Have we chosen a poor programming framework?
    • Have we picked the right ways of working together?
    • Is the infrastructure we’re using sufficient?
  • Getting some of these wrong can lead to a project ‘failing’. Now, there’s a lot to be said for failure as a learning method. And what better time is there to make a mistake than in your formative, educational years? It’s a more forgiving time than when you have a career, that’s for certain. To be sure, some of the choices the students made didn’t work out and, as long as the students subsequently understood why, those choices served as valuable lessons.

Outcome

As it went, the choices made by the students were varied and interesting. PHP was the most popular choice (roughly half the groups chose it), but other technologies like Ruby on Rails, Grails, .NET and JavaScript frameworks were also used. Furthermore, while few groups used the University’s hosting services, many elected to arrange their own. Clearly, they knew their own minds.

What I found particularly impressive, even though I wasn’t evaluating the students on this, was their originality and imagination. There’s clearly still a wealth of ideas for web-based applications out there if my students were a fair sample of the programming population.

I’ll leave you with a selection of the ideas to see for yourself (links correct at time of writing):

  • Crowdstory: An application for the community-driven writing of stories. Allows voting and discussion on sections of the story.
  • Div@: A place where you can discuss sites you disapprove of without linking to them (and thus giving them traffic) — instead, the app samples the target site as screenshots.
  • MoPad: A mobile game controller system. Multiplayer games are hosted in the browser and you use your mobile device (e.g. smartphone, tablet) as the controller.
  • Tourathon: An app which generates a tour of specific types of places between waypoints. For example, you can go on a pub crawl by specifying two points and requesting pubs/bars between them; a route which takes you along the pubs is generated. Integrates with Google Maps and Facebook (so you can share tours with friends).
  • Yum.is: Taking photos of your food is a popular pastime. This app allows you to upload your food photos to the site and link them with places they’re served on Google Maps. You can also use GPS positioning to get a selection of photos (or yums as they’re known here) in your area.
Why choose Python for teaching?
Mon, 07 May 2012

I recently read a tweet by a computer science educator claiming the superiority of a particular programming language for teaching purposes (Pascal, if you must know). Now, I don’t really go for religious wars — each to his own and all that — but I did reply with my opinion that Python might generally be a better choice.

Of course, language choice depends on the audience and what you’re trying to achieve. For very young students (younger than 11 or 12), I’d say a language like Logo is most suitable. Older programmers with experience might have particular requirements that Python can’t fulfil; they might need to learn low-level stuff for example. But for a general introduction to programming, I think Python is ideal.

My suggestion of Python was dismissed by the original tweeter on the grounds that “Pascal is easier to learn” and we have “more experience of teaching Pascal.” I dispute the first reason, and to be honest, I don’t really know what the second one actually means. Someone able to teach programming languages shouldn’t take too long to learn and adapt to a new one; besides, those doing the teaching in universities are often teaching assistants or graduate students, who probably have no experience of Pascal on account of their youth.

Anyway, I thought I’d make clear my reasons for Python preference in a blog post — after all, it’s a bit more flexible than a 140-character tweet.

Clean

Python is famed for how it departs from the norm by assigning meaning to a program’s layout. Instead of using curly brackets or keywords like begin and end, the code’s structure is made clear by how it’s indented. This has earned Python the nickname “readable pseudo-code”. Pseudo-code is what we normally teach students to draft their programs in first, so it’s a very short step from their draft version to the finished version of a program.

This is just one aspect of Python that gives it a clean overall appearance. Another is how Python disposes of numerous bits of syntax that other popular languages insist on: all those brackets and semicolons, which beginners find distracting.
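A small illustration of that cleanliness (the function and its job are my own example; any resemblance to pseudo-code is the point):

```python
def evens_in(numbers):
    # Structure comes from indentation alone: no braces, no semicolons,
    # no begin/end keywords to distract a beginner.
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n)
    return result

print(evens_in([1, 2, 3, 4]))   # [2, 4]
```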

Playful

When I first started programming, I was using machines like the Commodore 64 or the Sinclair Spectrum. Those who remember them will remember that switching the machine on launched you instantly into the BASIC programming environment. If you had even the slightest curiosity about programming, it wouldn’t be long until you were typing in commands and trying to make your computer do all sorts of cute, fun little things. Remember this?

10 PRINT "FRANKIE SAY RELAX!"
20 GOTO 10

Programming on these old computers was so accessible because they were REPL (Read-Eval-Print Loop) environments. There was no infrastructure to set up, no compilation to worry about, and you got instant feedback about what you’d done. Programming like this can often be a revelation, because if you’re one of those lucky people who “gets” what all the fuss over programming is about, seeing a program instantly do something can draw you in very quickly. It would be a shame if beginners were put off by “heavy” languages which demand lots of work before bearing any fruit whatsoever.

Python gives you this same REPL environment as standard. All you have to do is launch the Python interpreter, and you can start entering lines of code straight away and get instant feedback.
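For instance (a sketch of a first session, not a prescribed lesson), lines like these typed at the interpreter's prompt each produce a result the moment you press Enter — the modern counterpart of that BASIC two-liner:

```python
# Each line runs immediately at the interactive prompt --
# no files, no compilation, instant feedback.
print("FRANKIE SAY RELAX!")   # appears at once
print("RELAX! " * 3)          # → "RELAX! RELAX! RELAX! "
print(2 ** 10)                # → 1024
```

The string-repetition trick in particular tends to raise a smile, which is precisely the kind of cheap delight that hooked a generation of BASIC tinkerers.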

Wide use in FLOSS

Python is a popular language, but I’m not usually one to let popularity get in the way of making the right choice. However, this popularity does give programming students a distinct learning advantage: Python is widely used in free/open source software projects. (Check out the stats on places like Freshmeat or SourceForge, and you’ll find thousands of projects coded in Python.) Why is this important? Programmers learn best by doing, not by listening to a lecturer or just reading a programming book. The next best thing to writing code from scratch yourself is to get hold of an existing program and go through the source, reading, learning, tweaking, extending. With thousands of FLOSS programs available covering every conceivable type of software, the student is spoiled for choice.

Out-of-the-boxiness

Python provides a few things out of the box which are instantly accessible to the beginner but not enforced. The first of these that the student is likely to need are the built-in data types, specifically lists and dictionaries, and using them is as simple as x = ['a', 'b', 'c']. Worrying about other data types (integers, strings, booleans and so on) can be delayed thanks to Python’s dynamic typing, a helpful feature in itself.
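A sketch of how little ceremony this involves (the variable names are mine):

```python
# Lists and dictionaries are ready to use -- no imports, no declarations.
shopping = ['bread', 'milk', 'eggs']
shopping.append('tea')            # lists grow on demand

ages = {'Ada': 36, 'Grace': 29}
ages['Charles'] = 42              # so do dictionaries

# Dynamic typing: the same variable can hold an integer now, a string later,
# so a beginner needn't confront type declarations on day one.
x = 3
x = 'three'
```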

There’s also out-of-the-box support for today’s most relevant programming paradigms (structured, object-oriented and functional), but unlike some popular languages, Python enforces none of them. This means that, once a beginner has grasped the very basics of programming, the teacher can easily proceed to teach a particular paradigm while still using Python.
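To sketch what that flexibility looks like (the task and all names here are my own illustration), the same computation can be expressed in each style without leaving the language:

```python
# The same task -- summing the squares of the even numbers -- three ways.

# Structured / imperative: an explicit loop and accumulator.
def sum_even_squares_loop(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

# Functional: built-ins like sum() over a generator expression.
def sum_even_squares_functional(numbers):
    return sum(n * n for n in numbers if n % 2 == 0)

# Object-oriented: data and behaviour bundled in a class.
class NumberBag:
    def __init__(self, numbers):
        self.numbers = list(numbers)

    def sum_even_squares(self):
        return sum(n * n for n in self.numbers if n % 2 == 0)
```

A course can therefore introduce the loop version first and graduate to the others later, all in one language and one syntax.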


The Mythical Man-Month Keeps on Giving /2011/06/the-mythical-man-month-keeps-on-giving/ /2011/06/the-mythical-man-month-keeps-on-giving/#comments Mon, 20 Jun 2011 10:04:25 +0000 /blog/?p=889 I’ve recently been re-reading Fred Brooks’s The Mythical Man-Month for something like the fourth or fifth time. It’s one of those textbooks so well written, and such a joy to read, that it approaches literature. It’s also a book that keeps on giving. Every time I read it I seem to get something new out of it, or learn things that I either missed or glossed over on previous occasions. Here’s what I got out of it on my most recent re-reading.

Information Hiding

How I missed this before I don’t know, but Brooks was far from enthusiastic about information hiding when it was first proposed by David Parnas back in the 1970s. He was adamant that the programmer must know certain internal details about procedures, and called information hiding a “recipe for disaster” in the 1975 edition of his book. To be fair, Brooks wrote a full recantation in the second edition, admitting the advantages of encapsulation. I hope I can be as honest and gracious when I am persuaded to change my opinion.
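For readers who haven't met the idea, here is a minimal sketch of information hiding (the class is my own illustration, not an example from Brooks or Parnas): clients use only the public interface, leaving the internal representation free to change without breaking them — exactly the freedom Brooks initially feared and later endorsed.

```python
class Counter:
    def __init__(self):
        self._ticks = 0   # internal detail; the leading underscore signals
                          # (by Python convention) that callers keep out

    def increment(self):
        self._ticks += 1

    def value(self):
        return self._ticks

# Client code depends only on increment() and value(); how _ticks is
# represented could change tomorrow without any caller noticing.
c = Counter()
c.increment()
c.increment()
print(c.value())   # → 2
```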

The End of WIMP?

In a prediction of the future (written in 1995), Brooks presages the obsolescence of the WIMP style of interaction “in a generation”. He doesn’t say how long a generation is (maybe it’s passed already?), but amazingly, after however long it lasts, he predicts that speech input will be the way to do things. This is a prediction I just cannot go along with, if only because I know that sitting in an office talking to my computer all day and being surrounded by others doing the same is not the way I want to program. Maybe some interaction models will evolve in this direction eventually (the computer-driven home entertainment system comes to mind), but I can’t help thinking about mobile phones, which have had voice recognition capabilities for a while now. Still, all I ever see are people thumbing their way through their phones.

Small-issue Meetings

A smaller thing I noticed was Brooks advocating a semi-regular meeting where only small issues are discussed (every few months in his advice). I’ve often found it a nagging problem that a priority-driven task-list risks leaving smaller issues undone as the larger ones monopolise our attention. Despite their merits, approaches such as Scrum and feature-driven development leave me unsatisfied in this regard. Sometimes addressing a bunch of smaller issues can be better than taking on a single biggie; the idea of forcing consideration of them through regular ‘small issue’ clear-outs intrigued me. I only worry about the levels of enthusiasm in a meeting like that.

Religion

This wasn’t the first time I’d noticed the religious influence in the book. It’s hard to miss, but before my most recent reading of Mythical Man-Month I had zero interest in religion and the references mostly passed me by. I’ve since become more knowledgeable about it, if only as a stupefied outsider, so I am much more conscious of when Brooks (an evangelical Christian) uses religious references. For example, he cites the comparison of three stages of creativity with the Christian trinity. He uses the cathedral at Reims — “glorious” because the builders chose to sacrifice their own ideas for the purity of its design, thus “the result proclaims not only the glory of God, but also His power to salvage fallen men from their pride” — as an argument in favour of conceptual integrity. Brooks’s proposal for achieving conceptual integrity is to empower one team-member as the “architect”, responsible for all design decisions — a single designer, so to speak.

This put me in mind of Charles Simonyi as described by technical journalist Robert X. Cringely. Simonyi, a Microsoft alumnus, developed a software management technique called metaprogramming, which Cringely referred to as the collective farming of software development. Cringely attributed the nature of metaprogramming to Simonyi having grown up in communist Hungary, saying he was unconsciously emulating the rigid structure of the society in which he was raised. Brooks’s ideas are quite different, but I wondered if the same was true of him and his perspective on software development management, in which a powerful architect creates and shapes new worlds, assisted by a team of saints and angels.

I’m only semi-serious with this one; it’s surely much more complicated than that. But do we all instinctively copy the same ideas from the culture in which we’re steeped?

“Why can’t it work like a TV?” /2011/04/why-cant-it-work-like-a-tv/ /2011/04/why-cant-it-work-like-a-tv/#comments Wed, 13 Apr 2011 11:16:16 +0000 /blog/?p=872 The research of Andrew Tanenbaum (who, like me, is based in a “Free” university, but his is “Vrije” where mine is “Freie”) has long involved computer operating systems, and he holds many disparaging opinions about their general state. He regards a number of common OS concepts as obsolete, be they file systems largely unchanged since the 1960s or big, monolithic kernels that stretch to millions of lines of code. (He famously declared the Linux kernel obsolete while it was still in its infancy.)

Tanenbaum’s group has taken these problems and developed Minix 3 as an embodiment of many of their solutions. In his articles and talks, he often calls upon some hypothetical grandma as an argument against the woeful state of software quality. This mouse-wielding octogenarian (I mean the grandma, not Tanenbaum) laments “why doesn’t it work like a TV?”, meaning why can’t you just switch on a computer and have it work for the next ten years without crashing?

(Original image: http://commons.wikimedia.org/wiki/File:Televison_Hungarian_ORION_1957.jpg)

All respect to Tanenbaum and his efforts at producing fault-tolerant, super-reliable software systems. Minix 3 contains many interesting and innovative ideas and has been demonstrated as an impressive proof of concept. Hopefully, this will help steer our industry towards levels of reliability common in just about every other major industry, and so salvage its reputation among the public.

However, I fear the days of his TV analogy are numbered. You see, the missus and I recently treated ourselves to a new TV. How surprised I was, after setting it up, to find that it is powered by the Linux kernel and an assortment of GNU software. But, as we already know: where there is software, there are crashes. If we are now entering the days where TVs are essentially running full operating systems, we may no longer be warranted in citing the TV as technology that “just works”. We’ve already suffered a Blu-ray player that was turned into a useless brick after it demanded software updates and we foolishly obliged.

ReviewBoard: Indispensable and Rather Spiffy /2011/02/reviewboard-indispensable-and-rather-spiffy/ /2011/02/reviewboard-indispensable-and-rather-spiffy/#comments Tue, 22 Feb 2011 10:36:14 +0000 /blog/?p=822 Code inspection is demonstrably one of the most powerful tools for preventing defects in software. For our own part, we who produce Saros have an inspection policy, whereby all but the most trivial changes committed to the version control system (VCS) must gather a minimum of two approving votes before being accepted. But how do we manage this?

In the past, patches were passed freely around on the project’s mailing-list. Team-mates could take each other’s patch files, apply them manually, and discuss them in mailing-list threads. While it did the job, it was somewhat cumbersome and only grew worse as the team grew in size. To improve matters, we chose to use ReviewBoard.

ReviewBoard is a web application that links to a project on your VCS. When you make local changes and produce a patch file, you can then upload your patch to ReviewBoard, which will compare the patch to the repository version and manage the diff for you. By “manage the diff”, I mean things like:

  • Produce a two-pane diff view, showing what has changed
  • Allow you to attach both general and line-by-line comments on the code
  • Make multiple versions of the same patch
  • Aggregate all feedback into a single thread
  • and so on…

In our time using ReviewBoard, it has become an extremely helpful tool indeed. I think we’d suffer withdrawal symptoms if it were taken away (although perhaps not as powerfully as losing code completion would — I’ve been there, it wasn’t pretty). This post is simply to express my own support and gratitude to the makers, and to point out that any sizeable team of software engineers could do a lot worse than use such a tool to manage their review process (you do have a review process, right?).

I also thought about passing on some advice about using ReviewBoard, which we at Saros have gathered, but I’ve been beaten to it by others on the Interweb, most notably by KDE’s Aaron Seigo. Otherwise, read a good software engineering handbook for the more general inspection stuff.

There are however a couple of additional tips I could pass on:

  • Due to one of the unusual quirks in the way the Saros team functions, we typically have only a single team-member producing a new feature. A new feature means a lot of new code and therefore a big patch. (Reviewing a two-thousand-line patch makes me yearn for something less painful, like root canal work.) This is where ReviewBoard’s multi-version support helps.
      1. Our team’s policy is that an in-progress feature should be broken up into small milestones, with a single permanent review request opened for the feature.
      2. A patch is posted at each milestone and reviewed. It can still be updated at this point, but as soon as it is approved (at, say, version N), all changes made up to version N are considered approved.
      3. At the next milestone, when the same review request is updated, you don’t have to review everything; you just ask ReviewBoard for the differences between versions N and N+1.

    This way, the effort required for reviewing a monster feature is kept under control.

  • ReviewBoard does a lot of work dynamically that requires it to talk to your VCS. If, like us, you keep your VCS on a less-than-speedy server, ReviewBoard can be slow. We solved this by connecting ReviewBoard to a self-hosted, read-only mirror repository that is much quicker, which makes for a much more enjoyable experience.