Ian Helfant
Dept: Russian
Project: Speech-to-Text and Voice Control
The promise that computers will respond to the human
voice has been a staple of science fiction ever
since the supercomputer HAL of 2001: A Space
Odyssey, and with the use of special software it is
finally a reality. In fact, the technology has been
around for a few years and continues to improve.
Ian Helfant, Assistant Professor in the Russian
department at Colgate, began using
voice-recognition software when he developed
severe symptoms of repetitive-use injury in both
arms that prevented him from typing on the
computer keyboard. “I can remember when my arms
would ache just walking down the sidewalk, let
alone brushing my teeth,” he recalls. “It all
happened after 7 months of working on a laptop in
Russia while doing research for my dissertation.”
That was six years ago and his injury is long
gone, but Ian continues to “talk” to his computer
as a method of inputting text and controlling its
on-screen movements.
With the use of a program called “Dragon
Naturally Speaking” – one of a handful that are
now available – Professor Helfant performs many
tasks on his computer without using his hands.
Composing email messages, writing articles in
Microsoft Word, surfing the Web in Internet
Explorer, composing lecture notes, and responding
to students’ threaded discussions are just a few
examples of how Ian uses his computer’s voice
recognition capabilities. Watching Ian speak into
a microphone mounted on a headset and seeing the
words magically appear on his computer’s monitor
is quite amazing. The voice recognition software
is trained to understand your voice as you use it,
which means the more the software is exposed to
your voice, the more accurate it becomes in
translating it into text. In theory, a proficient
user can attain 99% accuracy, although Ian says
he’s never quite reached that goal.
In one of Ian’s demonstrations within Microsoft
Word, he dictated a paragraph of approximately 100
words and the resulting text within Word was about
95% accurate. With the current generation of
computers, the dictation appears in “real time,” so
that there is little or no delay between dictation
and the appearance of the text on the computer
screen. To edit mistaken text, one has the option
of either using the conventional method of mouse
and keyboard, or using the voice commands built
into the software. Because Ian has used the
software for about six years, he is now proficient
at quickly using voice commands to fix the errors
in the text. “For someone who types 90 words a
minute, the dictation can still seem less
efficient, but for most of us it actually works
out to be quicker than typing,” he says.
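To put the accuracy figures above in concrete terms, here is a
minimal sketch of the arithmetic in Python. The function name
expected_corrections is hypothetical and is not part of any
dictation product; it only converts a word count and an accuracy
percentage into the number of words a user would expect to fix.

```python
def expected_corrections(word_count, accuracy_percent):
    """Return how many words are expected to need correction.

    Illustrative arithmetic only: assumes every misrecognized
    word needs exactly one correction.
    """
    return word_count * (100 - accuracy_percent) / 100


# A 100-word paragraph at the 95% accuracy seen in the demonstration
print(expected_corrections(100, 95))
# The same paragraph at the theoretical 99% accuracy
print(expected_corrections(100, 99))
```

In other words, the gap between 95% and 99% is the difference
between fixing about five words per paragraph and fixing about one.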
While Ian speaks into the microphone, he must
verbalize the punctuation – “period”, “comma”,
“semi-colon”, etc. When asked if this interrupts
his train of thought, Ian says it does not. He
says that after a short period of use, the
verbalization of punctuation, and many of the
other commands used to guide the program, became
transparent to him.
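The idea of spoken punctuation can be illustrated with a short
Python sketch. This is not how Dragon Naturally Speaking works
internally; the table SPOKEN_PUNCTUATION and the function
render_dictation are hypothetical names chosen for illustration,
and the sketch assumes the speech has already been recognized as
a list of words.

```python
# Hypothetical mapping from spoken punctuation words to symbols.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "semi-colon": ";",
}


def render_dictation(spoken_words):
    """Join dictated words, attaching spoken punctuation to the previous word."""
    text = ""
    for word in spoken_words:
        symbol = SPOKEN_PUNCTUATION.get(word.lower())
        if symbol is not None:
            text += symbol       # punctuation: no space before it
        elif text:
            text += " " + word   # ordinary word after the first
        else:
            text = word          # first word of the dictation
    return text


print(render_dictation(["Dear", "students", "comma", "class", "is", "cancelled", "period"]))
```

A real product must also decide when “period” is meant as an
ordinary word rather than a punctuation mark; this sketch always
treats it as punctuation.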
The other function of the voice-recognition
software that Ian uses quite proficiently is the
software’s ability to navigate the controls of the
computer. A simple voice command like “check
email” can be defined to launch Microsoft Outlook
and display your Inbox. Other voice commands like
“open”, “close”, “down”, “up”, and “delete” enable
normal functionality within Outlook or most other
Windows software programs. Used in conjunction
with the voice-to-text capabilities, Ian
demonstrated how he can open his email, check
various messages, respond to an email, and close
the program without ever touching the keyboard.
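The way a user-defined phrase such as “check email” can stand in
for an action might be sketched as a simple lookup table. The
Python below is a hedged illustration, not the software's actual
design: COMMANDS and handle_utterance are hypothetical names, and
the actions only return descriptions instead of controlling real
programs.

```python
# Hypothetical table of spoken phrases and the actions they trigger.
# Each action just returns a description so the sketch stays runnable.
COMMANDS = {
    "check email": lambda: "launch Microsoft Outlook and display the Inbox",
    "open": lambda: "open the selected item",
    "close": lambda: "close the current window",
    "delete": lambda: "delete the selected item",
}


def handle_utterance(utterance):
    """Treat a known phrase as a command; treat anything else as dictation."""
    action = COMMANDS.get(utterance.strip().lower())
    if action is None:
        return ("dictation", utterance)
    return ("command", action())


print(handle_utterance("Check email"))
print(handle_utterance("Thank you for your message"))
```

Anything that does not match a defined phrase falls through to
ordinary dictation, which mirrors how commands and speech-to-text
are used together in the demonstration described above.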
Two components are required to enable a
computer to recognize your voice – the software
and the headset microphone. The software Ian uses
is called Dragon Naturally Speaking Preferred (~$200)
from a company named ScanSoft. The recommended
“wired” microphone is from a company called “Andrea”
and plugs into the USB port in the back of your
computer (~$70), or into a good soundcard like
those made by Soundblaster. For better
flexibility, wireless microphones are available
(~$200 from Emkay Electronics), although Ian stresses that the
Andrea ANC-700 corded microphone yields the most
consistent results.
Repetitive Strain Injury (RSI) is an increasing
hazard of today’s computer-oriented workplace. It
is important that the office environment be set up
to minimize the danger of RSI, for example by encouraging
good posture at the computer. For those of us
who nevertheless find that our arms are in pain
while working at the computer, or who simply never
learned to type efficiently, voice dictation
technology is something to consider.
“Because my arms are fully recovered,” Ian
says, “I use the dictation alongside the keyboard
and mouse – if I know it’s going to be a long day
typing, or I just want to mix things up a bit. I
find the dictation is more efficient than typing
for some types of tasks, and less so for others.
Someone who was really in pain could do almost
everything via dictation as part of the recovery
process.”