For decades, computer scientists have dreamed of computers that respond to the human voice. Until recently, though, speech-recognition systems could be a nightmare. New users had to recite long scripts to train the software to the peculiarities of their voices, and the software’s translations could still be as mistake-prone as a first-year foreign-language student. But lately the technology has improved dramatically. Last summer Nuance Corp., the industry’s big player, released a new version of Dragon that’s winning raves. This year Microsoft included a voice-recognition feature in its new Vista operating system and dropped a reported $800 million to acquire a speech-software start-up called Tellme. Nuance and other companies, including Google, are working on systems that allow voice to replace the frenzied pecking on BlackBerrys and other mobile devices. “The technology has kind of snuck up on everyone,” says Bill Meisel, publisher of Speech Strategy News.
PC-based voice recognition is different from the “call center” systems you encounter when calling banks or airlines. Telephone systems recognize only simple vocabularies and are designed to work with any voice. In contrast, PC-based systems adapt to a single user’s speech, gaining accuracy over time. Nuance cites several reasons the software has improved lately. As more Dragon users began to have broadband connections, the company started remotely collecting data on the particular words and phrases that Dragon screwed up, allowing researchers to tweak their black-box algorithms to better target trouble spots.
At the same time, faster PCs allow Dragon to crunch more data, increasing accuracy without slowing performance. The company estimates 5 million Americans are now using Dragon software, and it envisions a future in which microphones join keyboards, mice and scanners as another everyday way to digitize data.
Until recently, most speech-recognition users toiled in hyperspecialized fields (like medical transcription) or suffered from physical disabilities, like repetitive-stress injuries, that impeded keyboarding. Now more customers are ordinary desk jockeys trying to boost productivity. Stanley Riemer, the managing partner at a Boston law firm, uses Dragon to answer 200 e-mails a day, often at home in the evenings, while sitting in a comfortable chair with his hands folded in his lap. “I never touch the keyboard unless I feel like it,” he says. With a noise-filtering microphone, he can even watch Red Sox games while he e-mails. “It’s changed my entire work style,” Riemer says. And as the practice grows, talking to yourself may become not a marker of madness, but the sign of a high-efficiency worker.