An Improved Algorithm to Enhance Speech Recognition

Created: 10/09/2011 - 23:40

Researchers from Purdue University and the University of Wisconsin-Madison have developed an algorithm that operates in real time to sharpen the contrast in the speech signal, minimizing the effects of the blur caused by a damaged inner ear. They report the development and evaluation of the algorithm in the open-access, peer-reviewed journal PLOS ONE.

It is projected that by 2030 there will be over 40 million adults and over 2 million children with hearing loss in the United States. The average reduction in earning potential for individuals with hearing aids is estimated to be $15,000 per year and is twice as much for individuals with untreated hearing loss.

Furthermore, it is estimated that more than 80% of adults with significant hearing loss do not use a hearing aid. An important variable influencing these figures is hearing aid performance, especially in noise. Advancements in hearing aid performance have the potential to improve quality of life for more than 10% of the American population, as well as the productivity of the average hearing-impaired worker.

Illustration of the dynamic effects of the Contrast Enhancement (CE) algorithm. Damage to the inner ear may distort a speech signal by spreading out its component frequencies, thereby blurring the information that is processed further up the auditory system. The CE algorithm attempts to minimize these effects by sharpening and expanding the component frequencies first, so that a clearer speech signal reaches the listener’s brain. (a) Unenhanced complex signal composed of an additive pair of sub-signals with opposing changes in frequency over time. (b) Enhanced complex signal demonstrating sharpening and expansion of the frequencies (x-axis) of the sub-signals following application of the CE algorithm. (c) Cross-section of channels at 150 ms for unenhanced (blue), dynamically enhanced (red), and instantaneously enhanced (black) complex signals illustrating spatial sharpening. (d) Cross-section of channels at 370 ms for unenhanced (blue), dynamically enhanced (red), and instantaneously enhanced (black) spatiotemporal signals illustrating sharpening and expansion of the frequencies of the sub-signals. Image credit: Authors.

The challenge in advancing hearing aid performance is to overcome the blurring, or distortion, of frequency information that is important for understanding speech, which is caused by damage to the inner ear. This distortion is the reason why making speech louder does not always make it clearer; in fact, amplification can contribute further to the distortion.

Alexander, Jenison, and Kluender report on a signal processing strategy, the Contrast Enhancement (CE) algorithm, that sharpens the contrast in the speech signal in an attempt to minimize the effects of the blur caused by the damaged inner ear. The CE algorithm operates in real time, so it is ideal for hearing aid applications. The innovative aspects of the CE algorithm include a simulation of biological processes thought to be important for understanding connected speech, especially those that operate across successive speech sounds to enhance signature changes in their frequency composition.
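To give a rough feel for what "sharpening the contrast" across frequency channels can mean, here is a minimal Python sketch. It is not the authors' CE algorithm: it uses a simple lateral-inhibition scheme, in which each channel is pushed away from the average of its two neighbours, applied to a single time frame. It ignores the across-time processing the article describes. The function names, the `gain` parameter, and the example values are all hypothetical and chosen only for illustration.

```python
def sharpen_contrast(channels, gain=1.0):
    """Sharpen peaks in a list of per-channel energies (one time frame).

    Illustrative lateral inhibition only, NOT the published CE algorithm:
    each channel is moved away from the mean of its two neighbours,
    and the result is clipped at zero.
    """
    n = len(channels)
    sharpened = []
    for i, energy in enumerate(channels):
        # Edge channels reuse their own value in place of the missing neighbour.
        left = channels[max(i - 1, 0)]
        right = channels[min(i + 1, n - 1)]
        neighbour_mean = (left + right) / 2.0
        sharpened.append(max(0.0, energy + gain * (energy - neighbour_mean)))
    return sharpened


def peak_to_valley(channels):
    """Difference between the strongest and weakest channel."""
    return max(channels) - min(channels)


# A "blurred" spectral peak spread over five hypothetical channels.
blurred = [1.0, 2.0, 4.0, 2.0, 1.0]
sharpened = sharpen_contrast(blurred, gain=1.0)

print(sharpened)                   # [0.5, 1.5, 6.0, 1.5, 0.5]
print(peak_to_valley(blurred))     # 3.0
print(peak_to_valley(sharpened))   # 5.5
```

In this toy example the peak grows and its flanks shrink, so the peak stands out more from its surroundings. A real-time system such as the one described would also have to track how the peaks move from one speech sound to the next.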

Normal-hearing listeners identified meaningless syllables, in quiet and in noise, that were first processed to simulate varying degrees of distortion associated with hearing loss. Consonant and vowel identification, especially in noise, was improved by the CE processing. The amount of improvement did not depend on the degree of simulated distortion or on talker characteristics. For consonants, when results were analyzed according to their key features, the most consistent improvement was for the feature that distinguishes subtle contrasts such as “aga” from “ada.” This is encouraging for hearing aid applications because confusions between consonants differing in this feature are a persistent problem for listeners with sensorineural hearing loss.

Reference Publication: Alexander JM, Jenison RL, Kluender KR. Real-Time Contrast Enhancement to Improve Speech Recognition. PLoS ONE 2011, 6(9): e24630. doi:10.1371/journal.pone.0024630.
