A Brief History of Voice Recognition Technology

Thanks to the marvels of modern technology, many of us have become lazy enough to earn the label ‘couch potato’, though that is mostly a matter of personal choice. Picture yourself as one, reclining on your sofa with the television switched on and your iPhone in the palm of your hand. All of a sudden, you have the bright idea of ordering a pizza, so you say “Hey Siri!” and the phone responds to your beck and call.

You ask Siri to search for the nearest pizza delivery joints, and you are presented with a list of them. Clearly, voice recognition has come a long way, but the technology is not recent: its roots go back to the 1950s. Let’s delve into the past and take a brief look at how voice recognition technology has evolved into the speech recognition software we know today.

From the 1950s to the 1960s

In the history of speech recognition technology, this was the era of ‘baby talk’, when only digits could be comprehended. In 1952, Bell Laboratories built ‘Audrey’, which could only understand spoken numbers. In 1962, IBM’s ‘Shoebox’ machine was able to understand 16 words in English. Later systems were enhanced to recognize 9 consonants and 4 vowels.

In the 1970s

The U.S. Department of Defense contributed heavily to the development of speech recognition systems: from 1971 to 1976, it funded the DARPA SUR (Speech Understanding Research) program. One result was ‘Harpy’, developed at Carnegie Mellon, which could comprehend 1,011 words. It employed a more efficient method of searching for logically plausible sentences.

There were parallel advances elsewhere, such as a device from Bell Laboratories that could understand more than one person’s voice.

In the 1980s

A major breakthrough was the adoption of the hidden Markov model, which uses statistics to estimate the probability that an unknown sound corresponds to a given word. Unlike earlier approaches, it did not rely on matching sounds against fixed templates. Many programs built on this method made their way into industry and business applications.
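To make the statistical idea concrete, here is a minimal Python sketch of how a hidden Markov model can score an unknown sound against candidate words. Everything in it is a toy assumption for illustration: the two words, the hidden states, the coarse sound labels ("hum", "hiss") and all of the probabilities are invented, and real recognizers of the era worked on far richer acoustic features. The scoring routine is the standard forward algorithm.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability that a hidden Markov model generates the sequence `obs`."""
    states = list(start)
    # alpha[s]: probability of the sounds heard so far, ending in hidden state s
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for sound in obs[1:]:
        alpha = {
            s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][sound]
            for s in states
        }
    return sum(alpha.values())


# Hypothetical word models: each word is a tiny left-to-right chain of two states.
WORD_MODELS = {
    "yes": dict(
        start={"y": 1.0, "s": 0.0},
        trans={"y": {"y": 0.5, "s": 0.5}, "s": {"y": 0.0, "s": 1.0}},
        emit={"y": {"hum": 0.8, "hiss": 0.2}, "s": {"hum": 0.1, "hiss": 0.9}},
    ),
    "no": dict(
        start={"n": 1.0, "o": 0.0},
        trans={"n": {"n": 0.5, "o": 0.5}, "o": {"n": 0.0, "o": 1.0}},
        emit={"n": {"hum": 0.9, "hiss": 0.1}, "o": {"hum": 0.95, "hiss": 0.05}},
    ),
}


def recognize(obs):
    """Pick the word whose model gives the observed sounds the highest probability."""
    return max(WORD_MODELS, key=lambda w: forward_likelihood(obs, **WORD_MODELS[w]))


print(recognize(["hum", "hiss", "hiss"]))  # yes
print(recognize(["hum", "hum", "hum"]))    # no
```

Notice that no stored recording of "yes" or "no" is ever compared with the input. Each word is simply a set of probabilities, and the recognizer picks whichever word makes the sounds it heard most likely.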

A doll for children, known as ‘Julie’, also appeared in 1987; children could train it to respond to their speech. But the speech recognition systems of the 1980s had one flaw: you had to pause after each spoken word.

In the 1990s

With the introduction of faster microprocessors, speech software became feasible. In 1990, the company Dragon released ‘Dragon Dictate’, the world’s first speech recognition software for consumers. In 1997, it followed up with the improved ‘Dragon NaturallySpeaking’, which let you speak at about 100 words per minute.

In 1996, BellSouth launched VAL, the first voice-activated portal. However, systems of this kind were inaccurate, and they are still a nuisance for many people.

In the 2000s

By 2001, speech recognition development had hit a plateau, until Google came along. Google released an application called ‘Google Voice Search’ for the iPhone, which offloaded to its data centers the enormous amount of data analysis needed to match user queries against actual examples of human speech.

In 2010, Google introduced personalized recognition on Android devices, which records each user’s voice queries to build a more accurate speech model. Google’s English model draws on 230 billion words. Eventually, Apple introduced Siri, which relies on cloud computing as well, and now you have a personal assistant who is not only intelligent, but funny and witty too.