Freedom of (compressed) Speech

Conventional dynamic range compression in hearing aids is burdened with a trade-off between compression speed and signal distortion. The faster the compressor reacts and the higher the prescribed compression ratio, the more of the signal's dynamics are lost. This is an unfortunate side effect, because the level contrast of speech, that is, the fluctuation of intensity between and within phonemes, carries information vital for intelligibility.
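The trade-off can be illustrated with a minimal sketch that works directly on a level track in dB. Everything in it is an assumption made for illustration and is not taken from any product: the 4 Hz sinusoidal test envelope, the one-pole level estimator with a single time constant, the 40 dB threshold, and the function names. The effective compression ratio is taken here as the input level range divided by the output level range.

```python
import math

def smooth_level(levels_db, tau_s, fs_hz):
    """One-pole (exponential) smoothing of a level track given in dB."""
    alpha = math.exp(-1.0 / (tau_s * fs_hz))
    estimate = levels_db[0]
    smoothed = []
    for level in levels_db:
        estimate = alpha * estimate + (1.0 - alpha) * level
        smoothed.append(estimate)
    return smoothed

def compress(levels_db, ratio, threshold_db, tau_s, fs_hz):
    """Output level = input level plus a gain driven by the smoothed level."""
    slope = 1.0 - 1.0 / ratio
    estimates = smooth_level(levels_db, tau_s, fs_hz)
    return [level - slope * max(est - threshold_db, 0.0)
            for level, est in zip(levels_db, estimates)]

def effective_ratio(in_db, out_db, skip):
    """Input level range divided by output level range, ignoring the warm-up."""
    in_range = max(in_db[skip:]) - min(in_db[skip:])
    out_range = max(out_db[skip:]) - min(out_db[skip:])
    return in_range / out_range

FS_HZ = 1000  # sampling rate of the level track (assumed)

# Assumed test envelope: 60 dB mean level, +/-10 dB modulation at 4 Hz, 6 s long.
envelope_db = [60.0 + 10.0 * math.sin(2.0 * math.pi * 4.0 * n / FS_HZ)
               for n in range(6 * FS_HZ)]

# Same prescribed ratio (2:1), two different time constants.
fast_out = compress(envelope_db, ratio=2.0, threshold_db=40.0, tau_s=0.005, fs_hz=FS_HZ)
slow_out = compress(envelope_db, ratio=2.0, threshold_db=40.0, tau_s=1.0, fs_hz=FS_HZ)

fast_cr = effective_ratio(envelope_db, fast_out, skip=4 * FS_HZ)
slow_cr = effective_ratio(envelope_db, slow_out, skip=4 * FS_HZ)
print(f"fast: {fast_cr:.2f}:1, slow: {slow_cr:.2f}:1")
```

With the fast time constant the effective ratio comes out close to the prescribed 2:1, so the envelope is flattened almost fully. With the slow time constant it stays close to 1:1, so the envelope survives but the gain barely follows the signal.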

Under Pressure

Most hearing aids of the 1990s and early 2000s paid little attention to the speech envelope. Constrained by limited processing power, they bluntly ironed out all peaks and valleys. The primary objective was to pass the electroacoustic verification of the input-output (I-O) curve. The dynamic compressor in early digital hearing aids established the still-persistent “amplifier analogy”. Everything beyond the goal of an accurate I-O plot was considered a luxury.

But the damage that fast compression did to the dynamics of speech was real. The reduction of spectral contrast is especially problematic in hearing aids with many frequency channels [Plomp 1988]. Flattening the time-domain envelope has similar negative effects. On the other hand, the compressor had to react quickly to follow all the changes in the signal. Opposing expert positions in favor of either fast or slow compression suggested an unsolvable zero-sum game.

Keep a Spring in Your Step

With increasing processing power available in hearing aids, strategies were developed to have the cake (reduce the overall dynamic range) and eat it too (keep the dynamics of speech). Here is a short overview of some dynamic range compression offerings on the market that aim to preserve the time-domain envelope of speech:

Widex

The variable-speed compressor in the BEYOND hearing aid family is actually a combination of a slow compressor and a fast compressor. The slow compressor positions the average power of the signal within the residual hearing range. Subsequently, the fast compressor, using a lower compression ratio (CR), mainly affects the parts of speech with the lowest and highest energy, ensuring audibility of soft sounds and reducing the impact of outliers at the loud end.

Oticon

Oticon's SpeechGuard is one of the first features for preserving the speech envelope during compression. The approach, called “floating linear gain”, slowly adapts the gain to the averaged power of the signal. Around this averaged power, a certain dynamic range (the dynamic range of speech is typically about 30 dB) is amplified linearly. Outside that range, fast compression takes care of the extreme parts of the signal.
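The general idea of a linear window that floats with the average level can be sketched as a static mapping from input level to output level. This is not Oticon's implementation. The function name, the 2:1 ratio, the 40 dB threshold, and the rule that the window gain follows a plain compressive law on the slow average are all assumptions made for illustration; only the 30 dB window width comes from the text.

```python
def floating_linear_output(level_db, avg_db, ratio=2.0,
                           threshold_db=40.0, window_db=30.0):
    """Output level (dB) for an instantaneous input level, given the slow average.

    Inside the window around avg_db the gain is constant (linear amplification).
    Outside the window the excess is compressed with the given ratio.
    """
    slope = 1.0 - 1.0 / ratio
    # Gain of the linear window, set by the slowly varying average level (assumed rule).
    window_gain = -slope * max(avg_db - threshold_db, 0.0)
    low_edge = avg_db - window_db / 2.0
    high_edge = avg_db + window_db / 2.0

    if level_db > high_edge:
        # Loud outliers: only the part above the window is compressed.
        return high_edge + window_gain + (level_db - high_edge) / ratio
    if level_db < low_edge:
        # Soft outliers: the part below the window is compressed as well.
        return low_edge + window_gain + (level_db - low_edge) / ratio
    # Inside the window: same gain for every level, so the envelope is preserved.
    return level_db + window_gain

# Hypothetical usage: the slow average sits at 60 dB.
for level in (35.0, 50.0, 60.0, 70.0, 85.0):
    print(level, floating_linear_output(level, avg_db=60.0))
```

Inside the window a 10 dB change at the input stays a 10 dB change at the output, while a peak 10 dB above the window edge is reduced to 5 dB above it.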

Oticon is also a pioneer in research on the role of working memory in understanding compressed speech.

ReSound

LINX Quatro doesn’t offer a separate feature for preserving the speech envelope. However, the brochure suggests that the product increases or retains dynamic range. Related patent applications describe a dynamic range compression in which the slope of the compression line and the compression threshold adapt to the average power of the input signal.

Signia/Siemens

The dynamic compressor in some products implements two compression knees instead of a single knee: a lower CR (low distortion) is applied to softer signals and a higher CR at the loud end. Time constants are also adaptive and set to “slow” for “speech in quiet” situations.
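A static I-O curve with two knees can be sketched as a piecewise-linear function. The knee positions, the gain, both ratios, and the choice to amplify linearly below the first knee are hypothetical values picked for readability, not settings of any Signia or Siemens product.

```python
def two_knee_output(level_db, gain_db=30.0, knee1_db=45.0, knee2_db=65.0,
                    cr_low=2.0, cr_high=4.0):
    """Static output level (dB) of a piecewise-linear I-O curve with two knees."""
    if level_db <= knee1_db:
        # Below the first knee: linear amplification (assumed).
        return level_db + gain_db
    if level_db <= knee2_db:
        # Between the knees: mild compression with the lower ratio.
        return knee1_db + gain_db + (level_db - knee1_db) / cr_low
    # Above the second knee: stronger compression with the higher ratio.
    level_at_knee2 = knee1_db + gain_db + (knee2_db - knee1_db) / cr_low
    return level_at_knee2 + (level_db - knee2_db) / cr_high

# Hypothetical usage: print a few points of the curve.
for level in (40.0, 55.0, 65.0, 85.0):
    print(level, two_knee_output(level))
```

Most of the compression is pushed to the loud end, so levels between the two knees, where much of conversational speech is assumed to fall in this sketch, see only the lower ratio.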

Two Pi

FLEXO dynamic range compression takes a different approach to the dilemma between slow and fast compression: the compression ratio is not implemented as a uniform constant, which would result in a straight-line dependence between input and output. Instead, the compression ratio is a nonuniform function of the histogram of the input signal. The resulting I-O curve is adaptively bowed so as to provide maximally flat gain to the part of the signal that conveys the vital temporal information.

Once such an I-O curve is calculated, fast time constants can be used, ensuring immediate audibility of the softest parts and control of the loud end.

All the methods described have one positive effect in common: the effective compression ratio applied to a speech signal is lower than the prescribed compression ratio. These new signal processing algorithms depart from the “amplifier analogy” of a hearing aid. Is this perhaps a sign that diagnostics and prescription need an update too?

Future Developments

Recent research [Souza 2015] suggests that increased dynamics of the speech signal can reduce cognitive load in understanding speech.

The importance of both envelope cues and the temporal fine structure of speech is supported by trials [Moore 2016] and might influence the future of diagnostic and prescriptive audiology.

For both speech and music, trials find a subjective preference for a signal with greater dynamics.

It is therefore likely that research will continue to focus on temporal fine structure and the cognitive effort involved in understanding speech.

References:

[Moore 2016] Moore BC, Sęk A. Preferred Compression Speed for Speech and Music and Its Relationship to Sensitivity to Temporal Fine Structure. Trends in Hearing. 2016 Sep 7.

[Plomp 1988] Plomp R. The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation transfer function. J Acoust Soc Am. 1988;83:2322-2327.

[Souza 2015] Souza P, Arehart K, Neher T. Working Memory and Hearing Aid Processing: Literature Findings, Future Directions, and Clinical Applications. Frontiers in Psychology. 2015;6:1894.