Why Does Software Break?

An essay on “Why Software Breaks” that touches on the complexities of software and the computer systems on which it is built – a complexity that is inherent to their flexibility, and therefore can never really be reduced or removed.

It’s only natural to wonder why, after all this time and our collective experience, we still produce buggy, brittle software that breaks and crashes. It’s also only natural to point at “software engineers” and then the other kinds of “engineers” – as in, the people who build bridges, skyscrapers, cars, planes, etc. – who can build things that work for years and don’t (generally) break down and crash, and ask “why can’t we do the same thing with software?”

To answer that question, it’s important to make a distinction between the physical world of bridges, skyscrapers, planes, and such, and the “thought-stuff” world of software.

While software is, to use the words of Frederick Brooks in The Mythical Man-Month, made purely of insubstantial “thought-stuff,” it is, ultimately, made by man – and as man is fallible, so too are the things that he creates. (After all, some bridges fall down, some skyscrapers collapse/leak/shake in the wind, and some planes crash.)

There’s also the “layer” aspect to keep in mind – software may be “thought-stuff,” but it doesn’t exist purely in a vacuum. It relies upon the perfect function of millions (or billions) of tiny, often microscopic physical components, which have been engineered with great specificity and tight tolerances. A few cosmic rays (or a clumsy user pulling out a cord) can screw up the perfect balance of all these components in unimaginable ways – sort of like pulling out the main support for a bridge, or blowing out the tire of a car. (Or, perhaps like having a few large birds fly into the engine of a plane!) When these sorts of things happen, the system – be it bridge, plane, car, or computer – fails, often spectacularly.

So, it’s less accurate to think of a computer system (hardware and software together) as being like a bridge, and more accurate to think of it as being like a giant clockwork mechanism – a huge Rube Goldberg-type device – with hundreds of finely inter-meshing gears and sprockets. If just one gear pops out of place, or one sprocket cracks a tooth, the system stops working properly – perhaps just a little bit, or perhaps so much so that more gears are forced out of place, and more sprockets are broken, until the entire thing collapses in a pile of ruin.

To carry the bridge metaphor in the other direction (as it were), it might be more accurate to think of a computer system as being like a bridge that not only functions like a bridge (gets people from one side to the other), but also functions as a musical instrument capable of producing classical, jazz, and electronic/techno music; predicts the weather; washes your clothes; generates electrical power; can be quickly reconfigured into a skyscraper home for people or a hospital, as needed; can float up and down the river to a new crossing (dynamically expanding or shortening its length as it goes, of course); and can also fly, carrying everyone on it to a new river, with new road signs that instantly match the new language and traffic patterns of the new location. It also has to do all this while not disturbing the environment around it, while simultaneously accepting any impact its environment puts on it, even if such impact might cause it to function in a manner contrary to the one for which it was designed.

If you were to try to build a physical bridge to do all of these things, it would probably break in much the same ways that software does.

To use a different analogy, consider the difference between a typewriter (a machine designed to do just one thing – type words) and a computer. No one would argue that the computer is a more reliable typing instrument – after all, the typewriter is fairly simple, and because it is designed to do just one thing, it can do it well. Also, when the typewriter fails, the cause is generally immediately apparent (e.g., out of ink ribbon) and can easily be understood – and fixed – by the user.

On the other hand, the computer – while on the surface just the same as the typewriter (a keyboard on which you type words) – is infinitely more flexible. There is an almost infinite number of other things that the computer could do in addition to typing – it could play music, calculate your taxes, control millions of tiny light-producing elements to display an interactive 3D environment (or a photo of your dog), talk to you using a synthesized voice, control complex machining equipment, participate in a global network, and almost anything else you could imagine.

When you consider that, it’s no wonder that computers have so many ways in which they can break. It’s exactly because they are so flexible that they are so fragile at times – their flexibility is their greatest strength, and at the same time, their greatest weakness. Because they are so generalized, getting them to do any one specific thing involves a lot of re-building of concepts (we call them “metaphors” in the world of software) just to get any useful work done, never mind actually taking care of the main task at hand.

In the end, software breaks because it, together with the computer on which it runs, makes up a general-purpose machine which we ask to do an enormous number of things (some often contrary to one another!). Even though we might only be asking it to do something simple at the surface (e.g., type a few words onto the screen), in reality there are innumerable hidden complexities involved in getting a general-purpose machine to do something so specific (and, we would hope, do it well). It’s only natural that there will be errors – both human-induced ones and artifacts of the system itself.

In other words, software breaks because computers are fantastically flexible general-purpose machines that, by their very nature, require complexity in order to do anything specific – and no layers of abstraction, big-M Methodologies, frameworks, or whatever else we come up with are going to change that simple and immutable fact.

The Desktop App is still King

Although all the “cool kids” these days seem to be writing web apps, and the word “cloud” has taken on a new meaning that is sure to confuse meteorologists and normal people alike, I still think that desktop apps are very important. Maybe even important enough to deserve a little more attention than they’ve been getting lately (living, as they do, in the shadow of the buzzword friendly “web app”).

Now, I’m not deriding web apps here – I use them, too! But let’s face it – the web was not designed to be an application. Look up the history of what hypertext means and you’ll see how far we’ve had to stretch it to get to where we are today.

Even the very best web apps tend to spend a lot of effort to look just like a desktop app. When the lazy programmer in me sees this sort of thing, it causes me to develop an unhealthy twitch (in minor cases) or curl up into a ball in the corner (in extreme cases), muttering something about “code reuse.”

Of course, this is just a generalization – there are web apps that have absolutely nothing in common with desktop apps – take Google and Twitter, for example.

But even truly unique web apps still end up tied to the desktop in one way or another. I use Google all the time; but most of my searches happen through the “search” box in Firefox. I use Google Notebook, but I get to it through a Firefox add-in. I use Twitter, but I do so through a client (I’m currently using Witty Twitter, but there are literally dozens and dozens of clients out there).

Sometimes I worry that we focus too much on web app design, to the detriment of desktop app design (UI design in particular). Web apps are cool, sure, but the desktop app is still “king,” and it’s not wise to ignore the king!

So Much for my “Upgrade” Path

There’s been a major change in my plans to eventually “upgrade” to a 64-bit OS (probably the 64-bit version of Windows 7). Namely, the idea that it could be an “upgrade” at all.

There apparently is no upgrade path from any Windows 32-bit edition to any Windows 64-bit edition. If you’re going to make the jump, you have to do a clean install.

Major bummer.

I Think I’ll Use Windows 7

So, I’ve been trying out the Windows 7 beta lately… and I think I’ve decided that, when Windows 7 is officially released, I will upgrade to it.

…Let me explain.

As you probably already know, I currently use Windows XP. It came with my computer when I bought it, and I just didn’t see the incentive to upgrade – what with the horror stories of driver incompatibilities and so forth. Given everything, it just seemed like it was better to wait until the device manufacturers got around to updating their drivers for Vista, and all the dust had settled (and there was a lot of dust, as you’ll recall).

The new Windows 7 Desktop you've probably seen a million times already.

Of course, by the time the dust had settled to my satisfaction, along comes the announcement of Windows 7. And after giving it a try (and having used Vista before as well), I can say that Windows 7 is just what many people say it is – basically, a “Vista Super Service Pack.” Vista 2.0. Vista “as it should have been.”

I mean, I do like some of the features of Vista and Windows 7 – the graphics sure do look nice, and I’d love to have the new Windows Media Center to play with. But Vista had some annoyances (some of which I’ve written about before) that bothered me just enough to keep me from upgrading.

But Windows 7 does some things to help that (Vista’s first Service Pack did too, to be fair). The improvements to the UAC (user account control) service were long-overdue (and, in my opinion, the problems with it should have been caught in beta testing). Windows 7 is now smart enough to realize when you (the user) actually clicked on something, and not second-guess you and ask you to “please approve what you just did.”

It’s worth noting, of course, that there were probably some BIG technical challenges to this, even though it seems simple (in principle). After all, it is possible to use functions like “SendKeys” and the like to simulate user interactions – and how is the computer supposed to know the difference between a “real” user action and one that was “simulated” by another program? Without a lot of re-working, hacks, and clever tricks, the answer is “it can’t.” But, being difficult isn’t a good enough excuse in this case, and I’m happy to see Microsoft went ahead and did the Right Thing, even though it was hard.

You can even make the new taskbar smaller, like it was in Vista, if you prefer.

Speaking of the “Right Thing,” that brings me to the new taskbar – a hotly debated topic among Windows 7 reviewers!

The new Windows 7 taskbar is, and will continue to be… controversial. It is, arguably, better than the old taskbar. It is also quite obviously a sort-of-copy of the Mac OS X “dock” – the idea of using the same icons for launching an application and for switching windows. It sounds confusing at first, and honestly, it is. It will take some getting used to.

In a way, though, I think of Windows 7 as being similar to Office 2007 – yes, things look all different, and yes, you’re going to have to learn some things over again, and what you used to have memorized won’t work anymore. But in the end, once you get used to it, you see that it really is better.

I was one of those people who, at first, was really annoyed with Office 2007. But after I got used to it (and as Yoda would say, “unlearned what I had learned”) I found I could find things more easily, work faster, and do more. I even found things I didn’t know about before. And isn’t that the point?

Likewise with Windows 7. Yes, it’s different, but really, it is better. And after you “unlearn what you have learned,” you’ll find you don’t really have to “learn” anything new, really. It all just makes sense, once you open your mind to it. And that’s a good thing – that’s what “intuitive” is supposed to be like.

An exchange from the movie The Lion King sums it up nicely:

Rafiki: Change is good.

Simba: Yeah, but it’s not easy!

Making the decisions for Windows 7 undoubtedly wasn’t easy for Microsoft. But sometimes, you have to make the “hard” decision to do what you know is right (or better), no matter how much people will complain that they can’t get back their classic start menu or whatever. (Incidentally, although the classic start menu is really gone, most of the things people gripe about – the “run” dialog box and the quick-launch toolbar – are still there; they’re just hidden a bit. But you can bring them back – just do a Google search and you’ll find people telling you how, if you don’t want to change.)

But there are other factors at play here, at least for me. For one, I’m getting ready to upgrade to a 64-bit CPU this year – finally making the jump to the land of 64-bits (which is where we’ll all be, eventually – it’s inevitable). I’ve basically maxed out the RAM that my computer can address, and I still find myself needing more – and the only way to get more than 4GB of RAM is to upgrade to a 64-bit CPU and a 64-bit OS.
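For the curious, that 4GB ceiling falls straight out of the arithmetic: a 32-bit address can only name 2^32 distinct bytes. Here is a quick sketch in Python (the function name is made up for illustration, and it deliberately ignores real-world wrinkles such as the chunk of address space reserved for devices):

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses a pointer of this width can name."""
    return 2 ** address_bits


GIB = 2 ** 30  # one gigabyte in the binary sense (1,073,741,824 bytes)

print(addressable_bytes(32) // GIB)  # 4
print(addressable_bytes(64) // GIB)  # 17179869184
```

So a 32-bit address tops out at 4GB, while a 64-bit address has room for billions of gigabytes, which is far more than any machine I’m likely to own.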

So, the move to 64-bits is going to force me to upgrade my OS one way or another – it might as well be to the latest Windows version, right?

Now I just need to wait and see whether an upgrade from XP to Windows 7 will even be supported – because I don’t want to have to do a “clean” install. Here’s hoping!!

Bad Sectors? Low-Level Format

It seems like there IS a way to “clear” bad sectors from your hard drive so you can use tools like GParted and the like – but I use “clear” in a very loose sense here!

First off, I MUST point out that I’m talking about file-system bad sectors. I’m NOT talking about physically damaged disk platters.

It seems NTFS keeps a list of bad sectors, and as long as that list has entries in it, most partition-resizing tools will refuse to touch the disk with a 10-foot pole. HOWEVER… those of you who are beyond a certain age might remember something called a “low-level format.” (I’ll wait a moment for you while the moment of nostalgia passes.)

I thought I’d never see a need for low-level formatting in today’s world of super-reliable, super-fast, super-S.M.A.R.T, super-big hard drives – but it seems there is still one use for it.

The hard drive manufacturer’s low-level formatting utility will detect “bad” sectors and put them in the drive’s internal list of “bad” sectors – this is in the drive’s own firmware mind you, not in any file system structure (because at this point, your file system has been wiped out!).

Once this is done, the drive’s own controller will silently avoid those bad sectors – from any software’s point of view, those sectors or clusters just don’t exist anymore. (Since software rarely – if ever – directly addresses the disk, this sort of behind-the-curtain hiding of sectors or clusters is easily done by the drive’s on-board controller.)
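To make the idea a little more concrete, here is a toy model in Python of a controller hiding bad sectors from software. Everything in it (the ToyDrive class, its method names, the sector numbers) is invented for illustration; real firmware is far more involved, varies by manufacturer, and typically swaps in spare sectors rather than simply skipping the bad ones.

```python
class ToyDrive:
    """Toy model of a drive controller that hides bad sectors (illustrative only)."""

    def __init__(self, physical_sectors, bad):
        # Keep only the physical sectors that passed the (pretend) surface scan.
        self._good = [s for s in range(physical_sectors) if s not in bad]
        self._data = {}

    @property
    def size(self):
        # Software only ever sees the count of good sectors.
        return len(self._good)

    def _physical(self, logical):
        # Translate a logical sector number into a physical one.
        return self._good[logical]

    def write(self, logical, value):
        self._data[self._physical(logical)] = value

    def read(self, logical):
        return self._data.get(self._physical(logical))


drive = ToyDrive(physical_sectors=10, bad={3, 7})
print(drive.size)          # 8
drive.write(3, "hello")    # logical sector 3 lands on physical sector 4
print(drive._physical(3))  # 4
print(drive.read(3))       # hello
```

The point is that the software asks for “sector 3” and gets a perfectly good sector back, never knowing that physical sector 3 was skipped.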

After the low-level format, you can partition & format the drive normally, and your OS or whatever disk-checking tool you use should find a nice, clean disk with no errors.

Astute readers will note the downside to this “solution” – you have to low-level format your hard drive! Obviously this erases everything on it, without any possibility for recovery. So it’s not for the faint-of-heart.

And, incidentally, it’s not for me, either. I’ve simply got too much data on my 2nd hard drive to back it up easily (and cheaply). And doing this to my primary hard drive (and thus being forced to re-install Windows) is simply out of the question.

For those that are interested, I found a lot of this information in this EASEUS Software Forum posting, while looking for the reason why (you guessed it) their software wouldn’t re-size my partition.

In the end, it looks like I’m stuck running the Windows 7 beta in a virtual machine – which is of course super-slow.

Maybe someday I can shell out the cash for an external hard drive so I can back up my data and do the low-level format… but until then… I’m stuck with my partitions the way they are. Bummer!!

Interested readers may want to catch up on the previous entries in this saga: