Speech Technology Creates Career Opportunities

Imagine talking to your cousin in China. On her end, she hears Chinese. On your end, you hear English. In the middle, a computer is providing real-time translation. It's just like the "universal translator" in Star Trek.

This scenario is a real possibility within the next 10 or 20 years, say experts in voice technology. In the meantime, voice technology is making numerous inroads.

"It's such a dynamic technology," says Doug Alexander. He works with Speech Technology Magazine. "It has applications in so many industries that it's inevitable that it's going to continue to grow."

Voice technology promises to help consumers order products and services. It could help international businesspeople communicate. And it could increase security.

Experts are needed in a variety of fields. The most sought-after will be those with degrees in electrical and electronics engineering and linguistics.

Speech technology incorporates a variety of disciplines. Signal processing, for instance, uses digital computers to process signals that start out as analog, such as music or speech.
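To make the idea concrete, here is a minimal Python sketch of the first step in processing an analog signal digitally: measuring it at regular intervals (sampling) and rounding each measurement to a whole number (quantizing). The function name `sample_tone`, the 440 Hz test tone, and the 8,000 samples-per-second rate are illustrative assumptions, not details from the article.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, duration_s, bits=16):
    """Sample a sine wave and quantize each sample to a signed integer.

    Hypothetical helper for illustration only; real systems read
    samples from a microphone through an analog-to-digital converter.
    """
    max_amplitude = 2 ** (bits - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(sample_rate_hz * duration_s)
    return [
        round(max_amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(n_samples)
    ]

# A hundredth of a second of a 440 Hz tone, sampled 8,000 times per second.
samples = sample_tone(440, 8000, 0.01)
print(len(samples), "samples; first five:", samples[:5])
```

Once speech is in this numeric form, a computer can analyze it, which is the starting point for both recognition and synthesis.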

One specialty area in signal processing is automatic speech recognition. That's the process of using a computer to understand human speech.

Another discipline in speech technology is text-to-speech synthesis. That's where you take text input and generate synthetic speech.

These technologies have been around since the 1960s. That's when computers became powerful enough to do this sort of work. Technological changes within the last 10 years have made it all come together.

Computer processing power has increased greatly for very low cost. The algorithms that do the processing on both the speech recognition and the speech synthesis side have also improved drastically.

Plus, researchers can now build good speech models that capture the phonetic qualities of a specific language. That takes a lot of data.

"Within the last decade, companies have been able to record a tremendous amount of speech data," says Ed Bronson. He's in charge of speech engineering for a voice laboratory.

"For North American English, you need to be able to get enough speech samples from all over the country to pick up all the dialect variations, such as the New York twang...and the southern drawl in Texas."

Another big development in the last three years is the advent of VoiceXML. It's a web-based markup language for building interactive voice services. It allows websites to take voice input.

Instead of entering data into fields, you can just speak. For example, you could tell a courier company your tracking number. Then the system could tell you the location of your package.
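The courier example can be sketched in a few lines of Python. This is not VoiceXML and not any company's actual system. It assumes a speech recognizer has already turned the caller's words into text, and the names `PACKAGES`, `spoken_digits` and `handle_caller` are hypothetical, as is the sample tracking data.

```python
# Hypothetical tracking data; a real service would query a database.
PACKAGES = {"4821": "the Memphis sorting facility"}

DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def spoken_digits(transcript):
    """Pull the digits out of recognized text, ignoring other words."""
    return "".join(
        DIGIT_WORDS[word] for word in transcript.lower().split()
        if word in DIGIT_WORDS
    )

def handle_caller(transcript):
    """Return the sentence the system would speak back to the caller."""
    number = spoken_digits(transcript)
    location = PACKAGES.get(number)
    if location is None:
        return "Sorry, I could not find that tracking number."
    return f"Package {number} is at {location}."

print(handle_caller("my tracking number is four eight two one"))
```

The caller never touches a keypad: the spoken digits become the same lookup key a typed form field would have supplied.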

Companies want to reduce the cost of handling customer calls. This is a major driving force in the development of voice technology. Calls can be shortened. They could use less human intervention. That can mean big savings.

Advocates say the technology will also benefit consumers. They say it's easier and quicker to enter and request information using your voice. And some people, such as those on the road and those operating machinery, can't easily punch numbers into their phone.

Daryle Gardner-Bonneau is the editor-in-chief of the International Journal of Speech Technology. She says some of the technology seems like "technology in search of an application." In other words, critics might say some voice technology innovations have little practical use.

But there's no denying, she says, that voice technology has many valuable applications.

A big area is corporate security. Voice verification ensures that only authorized people can access sensitive information, such as financial data. In addition to checking a PIN, a computer can match your voiceprint to confirm your identity.
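A rough Python sketch of that two-part check follows. It assumes each voice sample has already been reduced to a short list of numbers (a stand-in for a voiceprint) and compares it with the enrolled one using cosine similarity. The feature values, the 0.9 threshold, and the function names are all assumptions made for illustration; commercial systems use far richer voice models.

```python
import hmac
import math

def cosine_similarity(a, b):
    """Return how closely two feature vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(pin, voice, enrolled_pin, enrolled_voice, threshold=0.9):
    """Accept the caller only if both the PIN and the voiceprint match."""
    pin_ok = hmac.compare_digest(pin, enrolled_pin)   # constant-time compare
    voice_ok = cosine_similarity(voice, enrolled_voice) >= threshold
    return pin_ok and voice_ok

# Made-up enrollment record and two login attempts.
enrolled = [1.0, 0.1, 0.5]
print(verify("1234", [0.9, 0.1, 0.4], "1234", enrolled))  # similar voice
print(verify("1234", [0.1, 0.9, 0.0], "1234", enrolled))  # different voice
```

Requiring both factors means a stolen PIN alone is not enough to get in.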

Currently, voice recognition is good at understanding most people if the pool of possible words is limited. In other words, if the software can predict what you'll say, such as with airline reservations, then the accuracy is good.
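Here is a small Python sketch of why a limited word pool helps. It assumes the recognizer has produced a possibly garbled text guess, then snaps that guess to the closest entry in a short list of expected answers using the standard library's `difflib`. The city list and the `match_city` function are hypothetical, and real recognizers apply this kind of constraint to sounds rather than spellings.

```python
import difflib

# The only answers the system expects at this point in the call.
CITIES = ["boston", "chicago", "dallas", "denver", "seattle"]

def match_city(heard, vocabulary=CITIES, cutoff=0.6):
    """Return the vocabulary word closest to what was heard, or None."""
    matches = difflib.get_close_matches(
        heard.lower(), vocabulary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None

print(match_city("chicargo"))   # a near miss is still resolved
print(match_city("xyzzy"))      # nothing in the pool is close
```

With only five candidates, even a sloppy guess usually lands on the right one. With the whole dictionary in play, many more words would be plausible matches.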

But voice recognition gets trickier when it comes to understanding natural conversation, with all its slang, varied tones, accents and so on.

Telecommunications and software companies are among the primary developers of voice technology.

"I'd guess there are several hundred [companies] at any one time in the U.S., many of them small players in particular niches," says Gardner-Bonneau. She estimates there are at least a few thousand employees of speech technology companies.

One company has applications that allow people to use speech to access e-mail, voicemail and fax. You log in using secure voice authentication. You can have your e-mail messages read to you. You can also respond to e-mails by voice.

The program will convert the message to text and send it off. It can also be used to track courier packages or for room service in hotels, for example.

Speech technology companies tend to hire people with bachelor's and master's degrees in electronic engineering and computer science. Since the technology is relatively new, marketing staff are needed to explain its benefits to potential customers. Linguists are also in demand.

"We're looking for speech scientists at the PhD or master's level in speech recognition or something related to that," says Marie Ruzzo. She works with a company that develops software for text-to-speech, speaker verification and speech recognition.

Speech recognition software is their primary product. It's used by major airlines and car rental agencies.

Ruzzo says voice recognition is a very cost-effective way to provide customer service. If someone wants to find out when a train is arriving, they can find out in seconds. They don't have to wait on hold for a live operator.

"I would say some of the hottest job opportunities...[with my employer] right now would be people who are linguists," says Ruzzo. "Because we operate around the world, we're creating speech recognition systems for languages for other countries."

As for salary levels, estimates are hard to come by.

"It can go all over the board," says Carvill. "Entry level can go anywhere from the $40,000 to $50,000 range right up to the sky's the limit. It really depends on your experience level and what you bring to the equation.

"A couple of years ago, it was probably a little higher, but that's [the] ballpark that you can expect for people coming out of school with the desired skill set, if you come out with a master's in engineering."

Ruzzo says that salaries are about $50,000 and up. "It depends on a number of factors," she says, "[including] any past experience working with others in the industry."

Bronson predicts that with VoiceXML becoming more prevalent, programmers are going to be in demand. "I see it as a really nice opportunity for people to get in and work in the applications area," he says.

So if you have an ear for language and a good technical education, speech technology could be your field. Maybe you can help make that universal translator a reality!


Midwest Speech Technology Association
Members include engineers, marketing and sales professionals, end users, consultants and other professionals

VoiceXML Forum
Event listings, news, an explanation of VoiceXML and more
