Speech Recognition and Other Unnatural Acts
Straight talk about NaturallySpeaking
By Deborah Quilter
First published February 3, 1998

Computer users with repetitive strain injuries are often told to type less, but short of changing jobs, this feat isn't always easy to accomplish. Speech-recognition programs, which transcribe dictation into text files, promise to free computer users from the keyboard. I tested Dragon Systems' speech-recognition program NaturallySpeaking, hoping it could make good on that promise.

I've been using NaturallySpeaking for nearly three weeks, and it makes me think of a dog that can walk on two legs: It doesn't do it well, but it's amazing it can do it at all. While NaturallySpeaking is indeed remarkable, it's no replacement for your keyboard.

WHAT YOU NEED

NaturallySpeaking is essentially a dictation program that functions like a rudimentary word processor. Instead of typing words, punctuation, and formatting commands, you dictate these elements and the program transcribes them. You can then print, save in plain text or rich text format, or copy the document into a traditional word processor or email message.

The system requirements are fairly modest. NaturallySpeaking needs at least a 133MHz processor, 32MB of RAM, and Windows 95 or NT 4.0. It's also important to have a compatible 16-bit sound card. Required hard disk space starts at 60MB but varies according to the edition and features you use. The program comes with a headset microphone that you plug into your sound card.

My test system, a newly reconditioned 200MHz Gateway running Windows 95, came with 32MB of RAM, a 3GB hard drive, and an Ensoniq Wavetable sound card. I started off with the basic Personal edition, but when I experienced general protection faults, I switched to the Preferred edition on the advice of a Dragon technician. (Dragon admitted to errors in the Personal edition.) Switching to the Preferred edition solved the problem.

In addition to the $159 Personal and $229 Preferred editions, there's also a $695 Deluxe version of NaturallySpeaking. All three have the same vocabulary size (230,000 words), but you can have up to 55,000 active words in the Deluxe edition. The other two are limited to 30,000 active words. Dragon says the Preferred and Deluxe editions can play back your recorded speech and translate any written text to speech.

GETTING STARTED

Installing and training the program are easy, if time-consuming. After installing the software, you read a few paragraphs so the program can adjust to your voice. Then you fine-tune the program's ability to understand your voice by reading about 18 minutes' worth of text. Dragon offers selections from "3001: The Final Odyssey," "Dave Barry in Cyberspace," and "Dogbert's Top Secret Management Handbook."

The program's interface is similar to a word processor's, with menus for File, Edit, View, Format, Tools, and Help options. When you dictate, your words first appear in a tiny window that shows the program analyzing your text. Then the analyzed transcription appears in a larger window. Separate windows pop up when you correct words and train the program to recognize words it has misunderstood.

To improve the program's transcription accuracy, you can run the vocabulary builder. NaturallySpeaking analyzes a general-purpose document you've written, such as a memo or email, and adjusts to your style of composing text. I used one of my columns to acclimate NaturallySpeaking to my style.

MISS TAKES HAP IN

Alas, NaturallySpeaking was a poor student. Here is an example of how the program interpreted some dictated text. My intended words are in parentheses:

"The program would frequently insert words that I had not battered (uttered), and it had a strange tech (tic) of inserting the word "and" when I had not Senate (said it), which I later discovered was how it interpreted my breathing. And know (though) it claims to be continuous speech, you sure deal (do) an awful lot of backing up and repeating yourself in a most unnatural and stressful fashion."

NaturallySpeaking often ran afoul of words that sound the same but are spelled differently. It's obvious that the program has a ways to go before it can recognize words in context.

While such errors might be amusing at first, I quickly tired of having to repeat myself--sometimes five or six times--to make corrections, meanwhile losing my train of thought. When I hemmed and hawed, my mistakes were dutifully printed. I then had to redictate and correct the errors. And though Dragon says you can use the alphabet to spell words, when I tried to change the word "in" to "and," I first got "kende." Then I got "ae" for "a." Then I got "ane" twice, at which point I used the keyboard. It was far easier to type the correct text than to spell it out.

SAY WHAT?!

Dragon claims that NaturallySpeaking can transcribe your words as fast as you can speak, up to 160 words a minute. That was not my experience. In general, I found that printed text lagged as the program figured out which words to choose. When I read a 266-word sample, the program finished transcribing the speech a minute and a half after I stopped, and it didn't include my final line of dictation. (The program failed to record significant chunks of text in other samples, too.)

In the same sample, it took me 58 minutes to correct the approximately 80 errors, and I cheated. When the program failed to correctly interpret my voice commands after the third try, I used my hands.

When NaturallySpeaking makes errors, you can use the correction box to record your pronunciation of the right and wrong words so the program can differentiate between them in the future. I trained the program to recognize the incorrect words in my 266-word sample and then redictated the text. The program repeated a few of the original errors, which made me wonder how effective the "training" system is. It also made about 20 new errors and failed to record 58 words at all.

When I told a Dragon technician about my error rate, he asked what I had dictated. "A simple children's story," I replied. "That's a bad example," he said. He explained that the program analyzes the likelihood of words appearing together in context. If you're not dictating something highly predictable, such as boilerplate contracts or medical reports, NaturallySpeaking is more likely to make mistakes.
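The technician's point about words appearing together can be illustrated with a toy example. The sketch below is not Dragon's method, which the company did not describe in detail; it is a minimal word-pair counter built on a made-up scrap of training text, and the names `corpus`, `pair_counts`, and `pick` are invented for illustration. Given two sound-alike candidates, it prefers whichever one followed the previous word more often in the training text.

```python
from collections import Counter

# Made-up training text standing in for whatever a recognizer learned from.
corpus = (
    "the patient said it hurt . the senate passed the bill . "
    "she said it twice . he said it again . the senate voted ."
).split()

# Count how often each word follows each other word.
pair_counts = Counter(zip(corpus, corpus[1:]))

def pick(previous: str, candidates: list[str]) -> str:
    """Return the candidate seen most often after `previous` in the corpus."""
    return max(candidates, key=lambda word: pair_counts[(previous, word)])

print(pick("she", ["said", "senate"]))  # "said" followed "she" in the corpus
print(pick("the", ["said", "senate"]))  # "senate" followed "the" in the corpus
```

When the previous word never appeared in the training text, every candidate scores zero and this sketch simply returns the first one. That is a crude picture of why unfamiliar material, such as a children's story, might trip up a program tuned to predictable phrasing.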

To troubleshoot the high error rate, I read a brief, straightforward passage from Dragon's Web site. I got three errors in a 23-word sample the first try and four the next time, including one error I had "corrected." I also sent Dragon's technicians an audio file I recorded using NaturallySpeaking. Their analysis of the file indicated that my sound card's quality was "normal," which meant it was doing its job correctly. Incidentally, this sound card test is available for free as a troubleshooting method to anyone who buys the program.
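For readers who want the figures above as percentages, here is a rough tally. It treats each error as one wrong word out of the words dictated, which is a simplification rather than a formal accuracy measure, and it uses "about 80" as exactly 80. The function name `error_rate` is made up for this sketch.

```python
def error_rate(errors: int, words: int) -> float:
    """Errors per dictated word, expressed as a percentage."""
    return 100.0 * errors / words

# (errors, words) for each sample described in this article.
samples = {
    "266-word sample (about 80 errors)": (80, 266),
    "23-word passage, first try": (3, 23),
    "23-word passage, second try": (4, 23),
}

for label, (errors, words) in samples.items():
    print(f"{label}: {error_rate(errors, words):.1f}%")

# 58 minutes spent correcting roughly 80 errors.
seconds_per_fix = 58 * 60 / 80
print(f"Average correction time: {seconds_per_fix:.1f} seconds per error")
```

By this count, roughly three words in ten were wrong in the longer sample, and each correction took about 43.5 seconds on average.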

A Dragon technician said my error rate was unusual and suspected my PC was at fault. However, he agreed that my hardware and software were up to spec and couldn't pinpoint anything like ambient noise or poor diction as the cause of the errors.

DON'T TOSS THAT KEYBOARD

Speech-recognition programs are often suggested as a solution for people with repetitive strain injuries, but NaturallySpeaking is not entirely hands-free. Besides falling back on the keyboard when voice commands don't work, you need your hands for some actions, such as turning on the microphone, typing the first few letters of what you're looking for in Help, and holding down the Control key so the program realizes the word you're dictating is a command, such as "click file" or "select."

A heavy user of the program could also be at risk for another injury: serious vocal strain. There's nothing natural about talking for hours on end, as some jobs would demand, or being tethered to a computer by a headset microphone. The likelihood of a vocal injury could be heightened if you find yourself talking in a halting cadence, as I did when the program couldn't keep up with me or when it made mistakes.

NaturallySpeaking may be more appropriate for doctors, lawyers, and other people who dictate predictable strings of text. If you try this software, make sure your PC meets the system requirements, or your results may be worse than mine. If you want to run other software in the background, you'll need extra memory. Otherwise Windows might slow down, which in turn will slow NaturallySpeaking. Small details like microphone placement and ambient noise can also lower performance, although those factors were not a problem in my case.

I hope the program's performance is better in a wider variety of situations. It could be a great boon to switch between computing by hand and voice and thus reduce your overall bodily strain.

Except where specifically noted, all contents of this site copyright © 1996 - 2003 Deborah Quilter. All rights reserved.