Ian Helfant
Dept: Russian
Project:  Speech-to-Text and Voice Control

The promise that computers will respond to the human voice has been a staple of science fiction ever since HAL, the supercomputer of 2001: A Space Odyssey. With the use of special software, it is finally a reality. In fact, the technology has been around for a few years and continues to improve.

Ian Helfant, Assistant Professor in the Russian department at Colgate, began using voice-recognition software when he developed severe symptoms of repetitive-use injury in both arms that prevented him from typing on the computer keyboard. “I can remember when my arms would ache just walking down the sidewalk, let alone brushing my teeth,” he recalls. “It all happened after seven months of working on a laptop in Russia while doing research for my dissertation.” That was six years ago, and his injury is long gone, but Ian continues to “talk” to his computer as a way of entering text and controlling what happens on screen.

With the use of a program called “Dragon Naturally Speaking,” one of a handful now available, Professor Helfant performs many tasks on his computer without using his hands. Composing email messages, writing articles in Microsoft Word, surfing the Web in Internet Explorer, composing lecture notes, and responding to students’ threaded discussions are just a few examples of how Ian uses his computer’s voice-recognition capabilities. Watching Ian speak into a microphone mounted on a headset and seeing the words appear on his computer’s monitor is quite amazing. The software is trained to understand your voice as you use it, which means the more it is exposed to your voice, the more accurately it translates your speech into text. In theory, a proficient user can attain 99% accuracy, although Ian says he has never quite reached that goal.

In one of Ian’s demonstrations within Microsoft Word, he dictated a paragraph of approximately 100 words, and the resulting text was about 95% accurate. With the current generation of computers, the dictation appears in “real time,” so there is little or no delay between speaking and the appearance of the text on the computer screen. To correct mistaken text, one can either use the conventional mouse and keyboard or use the voice commands built into the software. Because Ian has used the software for about six years, he is now proficient at using voice commands to fix errors quickly. “For someone who types 90 words a minute, the dictation can still seem less efficient, but for most of us it actually works out to be quicker than typing,” he says.

While Ian speaks into the microphone, he must verbalize the punctuation: “period”, “comma”, “semi-colon”, and so on. When asked if this interrupts his train of thought, Ian says it does not. After a short period of use, verbalizing punctuation, along with many of the other commands used to guide the program, became transparent to him.
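
For readers curious how spoken punctuation might end up as symbols in the text, here is a minimal sketch in Python. It is only an illustration and not how the Dragon software works internally; the names SPOKEN_PUNCTUATION and render_dictation are hypothetical, and the word list covers just the three examples mentioned above.

```python
# Hypothetical mapping from spoken punctuation words to symbols.
# Only the three examples named in the article are included.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "semi-colon": ";",
}


def render_dictation(words):
    """Join dictated words, turning spoken punctuation into symbols.

    A punctuation word is attached to the end of the previous word.
    If there is no previous word, it is kept as an ordinary word.
    """
    rendered = []
    for word in words:
        mark = SPOKEN_PUNCTUATION.get(word.lower())
        if mark is not None and rendered:
            rendered[-1] += mark  # attach the symbol to the previous word
        else:
            rendered.append(word)
    return " ".join(rendered)


if __name__ == "__main__":
    print(render_dictation("Hello comma world period".split()))
```

A simple lookup like this cannot tell the punctuation mark “period” from the ordinary word “period”; this sketch makes no attempt to handle that case.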

The other function of the voice-recognition software that Ian uses quite proficiently is the software’s ability to navigate the controls of the computer. A simple voice command like “check email” can be defined to launch Microsoft Outlook and display your Inbox. Other voice commands like “open”, “close”, “down”, “up”, and “delete” enable normal functionality within Outlook or most other Windows software programs. Used in conjunction with the voice-to-text capabilities, Ian demonstrated how he can open his email, check various messages, respond to an email, and close the program without ever touching the keyboard.
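
The idea of defining a phrase such as “check email” to trigger an action can be pictured as a lookup table from spoken phrases to actions. The Python sketch below is an assumption-laden illustration, not the software's actual mechanism: the function names, the COMMANDS table, and the returned descriptions are all hypothetical, and the actions only describe what would happen rather than launching any program.

```python
from typing import Callable, Dict


def check_email() -> str:
    # Placeholder action: a real setup would launch the mail program here.
    return "launch mail program and show Inbox"


def close_window() -> str:
    # Placeholder action for the spoken command "close".
    return "close the current window"


# Hypothetical table of user-defined voice commands.
COMMANDS: Dict[str, Callable[[], str]] = {
    "check email": check_email,
    "close": close_window,
}


def handle_utterance(text: str) -> str:
    """Run a matching command, or treat the words as dictated text."""
    phrase = " ".join(text.lower().split())  # normalize case and spacing
    action = COMMANDS.get(phrase)
    if action is None:
        return f"type: {text}"  # not a command, so pass it on as dictation
    return action()


if __name__ == "__main__":
    print(handle_utterance("Check email"))
    print(handle_utterance("Dear students"))
```

In this sketch, anything that does not match a defined command is passed through as dictation, which is one simple way to combine voice control with voice-to-text in a single session.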

Two components are required to enable a computer to recognize your voice: the software and a headset microphone. The software Ian uses is called Dragon Naturally Speaking Preferred (~$200) from a company named ScanSoft. The recommended “wired” microphone (~$70) is from a company called “Andrea” and plugs into the USB port in the back of your computer, or into a good sound card like those made by Sound Blaster. For more flexibility, wireless microphones are available (~$200 from Emkay Electronics), although Ian stresses that the Andrea ANC-700 corded microphone yields the most consistent results.

Repetitive Strain Injury (RSI) is an increasing hazard of today’s computer-oriented workplace. It is important that the office environment be set up to minimize the danger of RSI, for example by encouraging good posture at the computer. For those of us who nevertheless find that our arms are in pain while working at the computer, or who simply never learned to type efficiently, voice dictation technology is something to consider.

“Because my arms are fully recovered,” Ian says, “I use the dictation alongside the keyboard and mouse – if I know it’s going to be a long day typing, or I just want to mix things up a bit. I find the dictation is more efficient than typing for some types of tasks, and less so for others. Someone who was really in pain could do almost everything via dictation as part of the recovery process.”

 

The Profiles section of the website highlights what the Colgate faculty is doing with technology.
