Speech Recognition and Artificial Intelligence

Every so often, I’ll get a little pissed off and start wondering aloud, “where the hell are my talking computers?”

Seriously, though – it’s 2008. Ten years ago, we were sure that by now, speech recognition would have surpassed the keyboard as the primary means of input. Hell, we’ve been predicting it for so long, it’s become something of a hollow prediction – a lot like the “flying cars” argument.

But really, why aren’t we all talking to our computers? The answer, in my opinion, is that we haven’t developed artificial intelligence enough yet.

Why is artificial intelligence important for speech recognition, you ask? Let me explain.

We’ve had “basic” speech recognition for some time now. I have personally heard “Dragon NaturallySpeaking” described as the be-all and end-all of speech recognition software since somewhere around 1998 – and I’m still not using it. Nor is anyone else – at least not on any large scale. And there’s a very, very good reason for that – it’s simply not good enough.

Now, I’m not saying that speech recognition isn’t getting better at recognizing words and so forth, but at this point, using your computer via voice commands is a bit like trying to operate your computer through the same interface as the original Altair 8800. Oh sure, each individual switch works quite well – but try teaching your grandmother to check her email by just flicking 8 switches on the front of a panel with a few lights on it. That’s about where voice recognition is right now.

You see, there’s a very important “missing piece,” which is context. Or, to put it another way, consciousness.

In order for a speech recognition system to understand instructions given by a human being in plain speech, that system needs to be able to understand plain human speech – which, more often than not, requires a lot of understanding of the context in which it’s used. And to understand context like that, you need a rudimentary consciousness – something that has awareness – not necessarily of itself, but of what it’s working with. And we simply don’t have that yet.

Take an example.

Imagine you’re composing a message. You’re going to send it to your friend, “Bob.” Here’s how you’d use voice commands today:

Command mode. Open Email. Compose message. Dictation mode. “Hi Bob comma how are you doing today question mark capital I am doing just fine comma we enjoyed dinner with you last week period command mode backspace word backspace word command mode” Alt, File, S, Tab, Tab, Tab, Enter. Close Program.

And that’s with minimal errors – in reality, you’d be using the “backspace” or “undo” command quite often. And because speech recognition has no context, no consciousness, you need to tell it explicitly when you move from giving commands about what to do with the computer (basically, using voice commands as a slow and unreliable mouse pointer) to “dictation mode,” where it just writes what you say – basically acting like a bad transcriber. It’s slow, cumbersome, and unreliable. And until it becomes faster and easier (and, to a certain extent, cheaper) than using a keyboard and a mouse, it will remain a fringe method of input.
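To make the mode problem concrete, here is a minimal Python sketch of a modal voice interface. It is a toy under stated assumptions, not how any real speech product works: it assumes the recognizer has already split the audio into clean phrases, and the names `run_session` and `PUNCTUATION` are made up for illustration.

```python
# Toy model of a modal voice interface. Every spoken phrase is treated as
# either a command or dictated text, depending on a mode the speaker has
# to switch by hand. All names here are hypothetical.
PUNCTUATION = {"comma": ",", "period": ".", "question mark": "?"}


def run_session(tokens):
    """Return (commands, dictated_text) for a list of spoken phrases."""
    mode = "command"
    commands = []
    words = []
    for token in tokens:
        if token in ("command mode", "dictation mode"):
            # The only way to change modes is to say so explicitly.
            mode = token.split()[0]
        elif mode == "command":
            if token == "backspace word":
                if words:
                    words.pop()  # delete the last dictated word
            else:
                commands.append(token)
        elif token in PUNCTUATION:
            # In dictation mode, punctuation has to be spoken by name.
            if words:
                words[-1] += PUNCTUATION[token]
        else:
            words.append(token)
    return commands, " ".join(words)


session = [
    "open email", "compose message", "dictation mode",
    "hi", "bob", "comma", "how", "are", "you", "question mark", "oops",
    "command mode", "backspace word", "send",
]
commands, text = run_session(session)
```

Forget to say “command mode” before “send,” and the word “send” simply gets typed into the message. The program has no idea what you meant – it only knows which mode it is in.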

Contrast this with a voice command session with a computer equipped with speech recognition and a rudimentary AI:

Computer, begin new email to Bob. “Hey Bob, how are you doing today? I am doing just fine, we enjoyed dinner with you last week.” Send message.

Which one do you think most people could adapt to quicker – the first one, or the second one?

Remember also that we haven’t even touched upon corrections. With AI, you could say “no, wait, make that ‘I’m doing just fine’” and the computer would know (based partly on your emphasis on “I’m,” and partly due to its awareness of the sentence structure itself and the context in which it was used) which phrase to replace. Just you try that with today’s speech recognition!

I’m not sure if AI research is being pursued as much as it should be – I have a sneaking suspicion it’s not (probably due to fear of runaway AI and other ethical concerns). And maybe that’s a good thing, in the long run. But I’d like to see this sort of thing happen, and happen soon. Because I’m tired of typing – I want to talk to my computer.

I mean, seriously… it’s 2008! Wasn’t this sort of thing supposed to happen like 7 years ago, at least? Whatever happened to “life imitating art”?

I’m waiting…

The Evolution of the Desktop

My computer sure has come a long way.

Keith’s Desktop 2000

This is what my desktop looked like somewhere around April 13, 2000. I imagine it was either Windows 98 or Windows ME I was using at the time.

Note the Netscape icons – ah, the heady days of Netscape Communicator version 4!

Keith’s Desktop 2003

Then I upgraded to Windows 2000 – what a difference! And just look at all the icons I had in my system tray (sorry, the taskbar notification area).

Astute readers will notice I have a fondness for WinAmp that runs a long way back.

Keith’s Desktop 2007

And here we are today. The number of icons has gone down a lot, but that’s because I’ve learned the beauty of minimalism. And the option to “hide notification icons.”

There are a few more icons in my quick-launch area, but all in all not much has changed. I still use WinAmp (it’s semi-transparent at the top of the screen) and I still pretty much work the same way – often with the same programs – and my desktop reflects that. (Except now my desktop background changes every 15 minutes thanks to John’s Background Switcher. And of course I now use Firefox and Thunderbird instead of Netscape Communicator.)

I wish I had pictures from before this, but graphics (and screen captures) were kind of hard back then. Still, it’s interesting to see how far I’ve come.

Great Hackers

Explains why I don’t know Java. (Although I don’t know Python, either.)


Damn Penguins…

I’ve been spending the better part of the last two weeks struggling with penguins. OK, so I don’t mean the real birds – I mean Linux, the cool & froody Operating System that I love & support. Unfortunately, though setting up a Linux box for home or office use is quite easy, setting one up to be a secure web & database server is a bit more difficult.

First, some history:

Sanctuary continues to live in my office – the computer that was outdated back in 1998 continues to serve a useful purpose even today. Originally, Sanctuary was my personal computer – the replacement for the wimpy IBM Aptiva that I had freshman year. When I used it, it ran Windows 98, but eventually it went the way of the dinosaurs, and was reborn as a Linux box back in 1999 – 2000 (or somewhere thereabouts). Since it was a fairly stable PC, running a Pentium 233 MMX processor with a whopping 64 megs of RAM, it was perfectly happy running those early versions of Red Hat Linux. Eventually it even found life as a PC for my roommate-at-the-time, Dave – running Linux (again), of course.

After I moved to Fitchburg, Sanctuary found a third life as a disposable PC – something that I would install Windows 98 on, test my software product on (Windows 98 is the lowest platform we support), and then re-format it before testing again. It even became a PC for Amanda’s use during this time – though honestly, she hardly ever used it.

After my company moved into REAL office space (here in Fitchburg, as I have blogged before), Sanctuary moved with us. When we needed a secure, stable web & database server, the only possible answer was Sanctuary. Since we needed a “secure” web server, a Windows-based box was out of the question. So, after some research into what’s new in Linux these days, Sanctuary was reborn as a Debian Linux server – to be loaded with Apache (web server), MySQL (database server), SSL (secure sockets layer, used for secure web transactions), and PHP (web scripting language).

Now, you’d think it would be easy to do all this, since Linux has come so far since the “old” days.

Well, the answer is NO, it isn’t easy. And that’s why Linux on the desktop and in the office is still a ways off – and why Micro$loth still rakes in the dough – don’t dump your stock just yet.

Firstly, getting Apache + SSL to work together is a chore. The usual way to do things under Linux is to download the source code & compile stuff yourself. That’s all well and fine – after all, I am an accomplished programmer and am quite happy compiling my own programs – but these aren’t little programs, they’re BIG programs – and I have never gotten the hang of makefiles, which are what tell the compiler what to compile & how to put it all together.

First mistake on my part: I tried to combine the downloadable Debian packages for Apache & SSL with source-code compiled PHP and MySQL. The reason? Well, I wanted to be sure I had the latest versions, and the pre-made self-installing Debian packages were not always the latest version. Well, now I know getting those two things – pre-made packages & self-compiled programs – to work together is nearly impossible. Scratch that idea.

Second mistake: I tried to download & compile EVERYTHING. It actually worked quite well, up to a point. The problems happened when I tried to compile Apache itself, with the SSL (OpenSSL & mod_ssl, if you want to know) module. I just kept getting errors – strange errors too, about files that were present being missing, and constants not being defined. No help from Google on this one, so I gave up.

Finally, I installed everything using the Debian package manager – and it worked (thank god). So now I can settle down to getting the data imported into the database & set up the website and other things. Thus ends my affair with the penguin.
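For what it’s worth, a quick way to confirm that “it worked” is to check that each server is actually accepting connections. The Python sketch below is only an illustration, not what I ran at the time: it assumes the stock default ports (80 for HTTP, 443 for HTTPS, 3306 for MySQL), and the names `is_listening` and `check_stack` are made up.

```python
import socket

# Default ports, assumed unchanged from each server's stock configuration.
DEFAULT_PORTS = {"http": 80, "https": 443, "mysql": 3306}


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all of these as "down".
        return False


def check_stack(host="localhost", ports=DEFAULT_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}
```

Calling `check_stack()` on the server itself returns a dictionary of service names mapped to True or False. Note that a MySQL server configured to listen only on a local Unix socket will show as down here even though it is running fine.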

Even after all this, I still think Linux is a totally cool & froody OS; I mean, it runs quite happily on this ancient hardware I have (albeit slowly), and even when I manage to crash things (and crash them good), the OS itself keeps on chugging. And even through all these installs & uninstalls, NO REBOOTS WERE REQUIRED. Now THAT’S cool.

Amanda’s home early tonight, so that’s all the time I have for bloggification. Until next time…