Every so often, I’ll get a little pissed off and start wondering aloud, “where the hell are my talking computers?”
Seriously, though - it’s 2008. Ten years ago, we were sure that by now, speech recognition would have surpassed the keyboard as the primary means of input. Hell, we’ve been predicting it for so long that it’s become something of a hollow prediction - a lot like the “flying cars” argument.
But really, why aren’t we all talking to our computers? The answer, in my opinion, is that we haven’t developed artificial intelligence enough yet.
Why is artificial intelligence important for speech recognition, you ask? Let me explain.
We’ve had “basic” speech recognition for some time now. I have personally heard “Dragon NaturallySpeaking” touted as the be-all and end-all of speech recognition software since somewhere around 1998 - and I’m still not using it. Nor is anyone else - at least not on any large scale. And there’s a very, very good reason for that - it’s simply not good enough.
Now, I’m not saying that speech recognition isn’t getting better at recognizing words and so forth, but at this point, using your computer via voice commands is a bit like trying to operate your computer through the same interface as the original Altair 8800. Oh sure, each individual switch works quite well - but try teaching your grandmother to check her email by just flicking 8 switches on the front of a panel with a few lights on it. That’s about where voice recognition is right now.
You see, there’s a very important “missing piece,” which is context. Or, to put it another way, consciousness.
In order for a speech recognition system to understand instructions given by a human being in plain speech, that system needs to be able to understand plain human speech - which, more often than not, requires a lot of understanding of the context in which it’s used. And to understand context like that, you need a rudimentary consciousness - something that has awareness - not necessarily of itself, but of what it’s working with. And we simply don’t have that yet.
Take an example.
Imagine you’re composing a message. You’re going to send it to your friend, “Bob.” Here’s how you’d use voice commands today:
Command mode. Open Email. Compose message. Dictation mode. “Hi Bob comma how are you doing today question mark capital I am doing just fine comma we enjoyed dinner with you last week period command mode backspace word backspace word command mode” Alt, File, S, Tab, Tab, Tab, Enter. Close Program.
And that’s with minimal errors - in reality, you’d be using the “backspace” or “undo” command quite often. And because speech recognition has no context, no consciousness, you need to tell it explicitly when you move from giving commands about what to do with the computer (basically, using voice commands as a slow and unreliable mouse pointer) to “dictation mode,” where it just writes what you say - basically acting like a bad transcriber. It’s slow, cumbersome, and unreliable. And until it becomes faster and easier (and, to a certain extent, cheaper) than using a keyboard and a mouse, it will remain a fringe method of input.
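To make the “mode” problem concrete, here’s a toy sketch in Python of how a modal voice interface behaves. This is my own illustration, not how any real speech product works: the function name `interpret`, the phrase list, and the tiny punctuation table are all made up for the example. The point is that the program never understands anything - it only tracks which mode you last told it to be in.

```python
# Toy model of modal voice input. All names here are hypothetical,
# invented for illustration; no real speech product is being described.
PUNCTUATION = {"comma": ",", "period": ".", "question mark": "?"}

def interpret(utterances):
    """Split a stream of recognized phrases into commands and dictated text."""
    mode = "command"
    commands, words = [], []
    for phrase in utterances:
        lowered = phrase.lower()
        if lowered in ("command mode", "dictation mode"):
            # The only "context" the system has: an explicit switch from the user.
            mode = lowered.split()[0]
        elif mode == "command":
            commands.append(lowered)
        elif lowered in PUNCTUATION:
            # Spoken punctuation attaches to the previous dictated word.
            if words:
                words[-1] += PUNCTUATION[lowered]
        else:
            words.append(phrase)
    return commands, " ".join(words)

commands, text = interpret([
    "Command mode", "open email", "compose message",
    "Dictation mode", "Hi", "Bob", "comma", "how", "are", "you", "question mark",
    "Command mode", "send message",
])
print(commands)
print(text)
```

Forget to say “dictation mode” and every word of your message gets treated as a command instead. That’s the whole problem in a nutshell.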
Contrast this with a voice command session with a computer equipped with speech recognition and a rudimentary AI:
Computer, begin new email to Bob. “Hey Bob, how are you doing today? I am doing just fine, we enjoyed dinner with you last week.” Send message.
Which one do you think most people could adapt to more quickly - the first one, or the second one?
Remember also that we haven’t even touched upon corrections. With AI, you could say “no, wait, make that ‘I’m doing just fine’” and the computer would know (partly from your emphasis on “I’m,” and partly from its awareness of the sentence structure and the context in which it was used) which phrase to replace. Just you try that with today’s speech recognition!
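Just to show what “knowing which phrase to replace” might involve, here’s a very rough sketch in Python. It is emphatically not AI: it simply looks for the run of words that most resembles the correction, using plain string similarity from the standard library’s `difflib`. The function name `apply_correction` is my own invention, and it ignores vocal emphasis and real sentence structure entirely - which is exactly the part that would need actual intelligence.

```python
from difflib import SequenceMatcher

def apply_correction(text, correction):
    """Replace the run of words in `text` that most resembles `correction`.

    A crude stand-in for real understanding: string similarity only.
    """
    words = text.split()
    n = len(correction.split())
    best_score, best_span = 0.0, None
    # Try windows one word shorter or longer than the correction, since
    # "I'm" (one word) may be replacing "I am" (two words).
    for size in (n - 1, n, n + 1):
        if size < 1:
            continue
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            score = SequenceMatcher(None, candidate.lower(), correction.lower()).ratio()
            if score > best_score:
                best_score, best_span = score, (start, start + size)
    if best_span is None:
        return text
    start, end = best_span
    last = words[end - 1]
    trailing = last[len(last.rstrip(",.?!")):]  # keep punctuation that followed the old phrase
    return " ".join(words[:start] + [correction + trailing] + words[end:])

original = ("Hey Bob, how are you doing today? I am doing just fine, "
            "we enjoyed dinner with you last week.")
fixed = apply_correction(original, "I'm doing just fine")
print(fixed)
```

It works on this one sentence, but it would happily “correct” the wrong phrase any time two parts of a message look alike. Telling them apart is where the context comes in.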
I’m not sure if AI research is being pursued as much as it should be - I have a sneaking suspicion it’s not (probably due to fear of runaway AI and other ethical concerns). And maybe that’s a good thing, in the long run. But I’d like to see this sort of thing happen, and happen soon. Because I’m tired of typing - I want to talk to my computer.
I mean, seriously… it’s 2008! Wasn’t this sort of thing supposed to happen like 7 years ago, at least? Whatever happened to “life imitating art”?
I’m waiting…
Before we begin today’s rant, it is important to point out that the word “elegant” holds a special meaning for software people. Let me quote the entry from the New Hacker’s Dictionary (or the “Jargon File” as it is sometimes known):
Combining simplicity, power, and a certain ineffable grace of design. Higher praise than “clever”, “winning”, or even cuspy.
Now we can move on to the topic of today’s rant: Verizon. Verizon’s systems that interact with customers are most definitely not elegant.
This weekend I tried to pay my Verizon phone bill. It was near the due date, and so I didn’t want to mail a check (so… 20th century-ish) and paying by phone brings with it an irrational $3 fee (always a good business practice to make it harder for your customers to pay you… riiiiiight…), so paying online seemed to be the best option. What the heck, I’d done it before, right?
Well, no.
Since I’ve moved, my phone number has (obviously) changed. And although my online account still works (my user name and password log me in, anyway), there are no phone numbers associated with my account. Apparently, the USER - the CUSTOMER - is responsible for this association. Great call there, Verizon.
So I try to associate my phone number. In order to prevent random people from stealing your account, their system will actually call you on your home phone - which seems like a good idea at first glance. (For now, I’ll set aside arguments about how this won’t work unless you are at home with a working phone, as they didn’t apply to me.) Apparently, the way it works is that their automated system calls you and gives you a “temporary PIN,” which you then type into the website, and that’s how they verify that YOU are actually the owner of the line. What could possibly go wrong?
The problem here should now be obvious. Here I was, being spoken to by a computerized voice which read out some random bunch of letters & numbers (which really isn’t a “PIN” in the strictest sense of the word, but whatever), and for the life of me I couldn’t tell whether it was saying “B” or “D” or “V” or maybe even some other letter that sounds sort of the same (and there are a lot). It didn’t even use the standard phonetic alphabet readings, like “V as in Victor” and so forth, which exist precisely so that there can’t be any confusion.
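For the record, reading a code out with phonetic spellings is not hard. Here’s a minimal Python sketch of what I mean, using the standard NATO spelling alphabet. The function name `spell_out` and the sample code “B7DV” are made up for the example; I have no idea what Verizon’s system actually looks like inside.

```python
import string

# The NATO/ICAO spelling alphabet, in order from A to Z.
NATO_WORDS = (
    "Alfa Bravo Charlie Delta Echo Foxtrot Golf Hotel India Juliett Kilo Lima Mike "
    "November Oscar Papa Quebec Romeo Sierra Tango Uniform Victor Whiskey X-ray Yankee Zulu"
).split()
PHONETIC = dict(zip(string.ascii_uppercase, NATO_WORDS))

def spell_out(code):
    """Turn a code like 'B7DV' into an unambiguous script for a voice system."""
    parts = []
    for ch in code.upper():
        if ch in PHONETIC:
            parts.append(f"{ch} as in {PHONETIC[ch]}")
        else:
            parts.append(ch)  # digits and anything else are read as-is
    return ", ".join(parts)

print(spell_out("B7DV"))  # hypothetical code, not a real Verizon PIN
```

With that, “B”, “D” and “V” come out as “B as in Bravo”, “D as in Delta” and “V as in Victor”, and nobody has to guess.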
In the end, after many different tries of different combinations of letters, I gave up. I could not validate my account AT ALL online, and had to pay BY PHONE and be charged $3 for the privilege of getting my money to Verizon faster.
You can understand why this interaction with Verizon left me feeling like I’d been sucker-punched in the stomach.
For a company as big as Verizon, this is inexcusable. What’s worse is that because it was the weekend (Sunday, to be exact), ALL of their support phone numbers (which are hard enough to find as it is) were closed - except for the automated computer system that can only read you your balance (and take your payment and charge you $3 extra for it).
I’d love for someone to explain to me how, exactly, this is considered “good” customer service.
Let’s break down the transgressions, shall we?
It’s enough to make me think about finally giving up my hard line with Verizon and going entirely with VoIP phone service instead.
If you want 5 ways to lose customers and make them angry, just take these tips from Verizon.