FLEXO – orthogonal dynamic range compression
FLEXO is a novel method for dynamic range transformation that retains the dynamics of speech. It is designed to reduce the dependency on diagnostics and fitting.
The FLEXO algorithm performs continuous statistical analysis of the acoustic signal and applies a characteristic “orthogonal” compression. This approach is responsive and gentle at the same time, so it retains the envelope structure of speech and reduces the cognitive load during speech perception.
Well suited to demanding listening situations, FLEXO does not slow its reaction by damping its time constants or lowering the prescribed compression ratio.
Conventional dynamic range compression is burdened with a trade-off between compression speed and signal distortion. The faster the compression and the higher the prescribed compression ratio, the more of the signal's dynamics are lost. This is an unfortunate side effect, since the dynamic contrast of speech, that is, the fluctuating intensity between and within phonemes, carries information vital for intelligibility.
Recent research [Souza 2015] suggests that greater dynamics in the speech signal can reduce the cognitive load of understanding speech. The importance of both envelope cues and the temporal fine structure of speech is supported by trials [Moore 2016] and might influence the future of diagnostic and prescriptive audiology.
For both speech and music, trials find a subjective preference for greater signal dynamics.
FLEXO Orthogonal Dynamic Range Compression works with a non-uniformly distributed compression ratio (CR). Instead of using a single CR value to control the whole compression range, FLEXO regards CR as a flexible basis (in the sense of linear algebra) for projecting one dynamic range onto another. This novel perspective on CR also implies that the CR basis can be optimized to preserve the fidelity of the original signal.
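The text does not disclose how FLEXO chooses or optimizes its CR basis, so the following Python sketch is only an illustration of the general idea of a non-uniform CR, not FLEXO's algorithm. It assumes a piecewise-linear input/output curve whose local slopes differ from segment to segment but are scaled so that the whole input range still maps onto the range a single prescribed CR would give. The function names, segment edges and weights are hypothetical.

```python
def segment_slopes(edges_db, weights, cr):
    """Scale per-segment weights into slopes (dB out per dB in).

    Assumption for illustration: the slopes are normalised so that the
    total output range equals (total input range) / cr.
    """
    widths = [hi - lo for lo, hi in zip(edges_db, edges_db[1:])]
    target_range = (edges_db[-1] - edges_db[0]) / cr
    scale = target_range / sum(w * d for w, d in zip(weights, widths))
    return [w * scale for w in weights]


def map_level_db(level_db, edges_db, slopes):
    """Map an input level through the piecewise-linear curve.

    The curve is anchored at the lowest edge (an arbitrary choice);
    levels outside the mapped range are clamped to it.
    """
    level = min(max(level_db, edges_db[0]), edges_db[-1])
    out = edges_db[0]
    for lo, hi, slope in zip(edges_db, edges_db[1:], slopes):
        if level <= lo:
            break
        out += slope * (min(level, hi) - lo)
    return out


# Hypothetical example: 50-80 dB input, prescribed CR = 2.5,
# with the middle segment compressed less than the outer ones.
edges = [50.0, 60.0, 70.0, 80.0]
slopes = segment_slopes(edges, weights=[0.5, 2.0, 0.5], cr=2.5)
local_cr = [1.0 / s for s in slopes]  # local compression ratio per segment
```

In this hypothetical setup the middle segment keeps most of its contrast (a local CR close to 1), while the outer segments are compressed harder so that the overall 30 dB input range still lands on the prescribed output range.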
With CR = 2.5, an attack time of 10 ms and a release time of 100 ms, conventional wide dynamic range compression (WDRC) reduces 30 dB of natural speech dynamics to around 15 dB, eliminating much of the temporal contrast. The effect is visible in the figure below.
Using the same time constants and the same CR prescription, FLEXO produces the speech signal shown in the bottom part of the figure. The dynamics of speech are retained while the overall mapping of input to output dynamic range is ensured: the best of both worlds!
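For the conventional WDRC case, a static rule with CR = 2.5 maps a 30 dB input range onto 30 / 2.5 = 12 dB. How much contrast actually survives also depends on how fast the level fluctuates relative to the attack and release times. The Python sketch below is a simplified single-band model of a conventional compressor (not FLEXO) that works on a level track in dB rather than on a waveform. The knee point, the 50 and 80 dB test levels and all function names are assumptions made for illustration.

```python
import math


def smoothing_coeff(time_ms, rate_hz):
    """One-pole smoothing coefficient for a time constant given in ms."""
    return math.exp(-1.0 / (time_ms * 1e-3 * rate_hz))


def track_level_db(levels_db, rate_hz, attack_ms=10.0, release_ms=100.0):
    """Smooth a level track: fast when the level rises, slow when it falls."""
    a_att = smoothing_coeff(attack_ms, rate_hz)
    a_rel = smoothing_coeff(release_ms, rate_hz)
    est = levels_db[0]
    tracked = []
    for level in levels_db:
        a = a_att if level > est else a_rel
        est = a * est + (1.0 - a) * level
        tracked.append(est)
    return tracked


def static_gain_db(level_db, cr=2.5, knee_db=45.0):
    """Single-CR gain rule: no gain change below the knee, slope 1/CR above."""
    if level_db <= knee_db:
        return 0.0
    return (level_db - knee_db) * (1.0 / cr - 1.0)


def effective_range_db(hold_ms, periods, low_db=50.0, high_db=80.0,
                       rate_hz=1000.0):
    """Mean output level of loud segments minus that of soft segments.

    The input alternates between high_db and low_db, holding each level
    for hold_ms. Only the second half is analysed to skip the start-up.
    """
    hold = int(hold_ms * 1e-3 * rate_hz)
    levels = ([high_db] * hold + [low_db] * hold) * periods
    tracked = track_level_db(levels, rate_hz)
    out = [x + static_gain_db(t) for x, t in zip(levels, tracked)]
    half = len(levels) // 2
    loud = [o for x, o in zip(levels[half:], out[half:]) if x == high_db]
    soft = [o for x, o in zip(levels[half:], out[half:]) if x == low_db]
    return sum(loud) / len(loud) - sum(soft) / len(soft)


slow = effective_range_db(hold_ms=2000.0, periods=4)   # slow level changes
fast = effective_range_db(hold_ms=5.0, periods=400)    # fast level changes
```

In this model, slow level changes are compressed almost down to the static 12 dB, while fluctuations faster than the time constants pass nearly uncompressed. Speech contains both, which is one plausible reading of why the compressed range quoted above sits somewhat above the static value.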
The beneficial effect of FLEXO is also visible in the next figure, which shows level histograms representing the dynamics of the unprocessed and processed speech signals:
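A level histogram of this kind can be computed from short-term frame levels. The Python sketch below shows one common way to do it. The frame length, the 1 dB bin width and the 1st/99th percentile range measure are assumptions chosen for illustration, not the settings used for the figure.

```python
import math


def frame_levels_db(samples, frame_len):
    """Short-term RMS level in dB (re full scale) of consecutive frames."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        levels.append(10.0 * math.log10(max(power, 1e-12)))  # avoid log(0)
    return levels


def level_histogram(levels_db, bin_width_db=1.0):
    """Count frames per level bin; keys are the lower bin edges in dB."""
    counts = {}
    for level in levels_db:
        edge = math.floor(level / bin_width_db) * bin_width_db
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


def dynamic_range_db(levels_db, lo_pct=1.0, hi_pct=99.0):
    """Spread between a low and a high percentile of the frame levels."""
    ordered = sorted(levels_db)

    def pick(pct):
        return ordered[round(pct / 100.0 * (len(ordered) - 1))]

    return pick(hi_pct) - pick(lo_pct)
```

Running the same analysis on the unprocessed and the processed signal gives two histograms that can be compared directly: a narrower histogram means less dynamic contrast.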
FLEXO does not slow its reaction by restricting time constants or lowering the prescribed CR. It is therefore able to avoid uncomfortable loudness levels when excessively loud inputs suddenly appear.
FLEXO accepts convenient prescriptions delivered by NAL-NL2 or similar fitting formulas. However, the more interesting application is a self-fitting hearing device, where FLEXO can control loudness in a less rigid manner than classical WDRC. FLEXO delivers complete mapping of input to output dynamic range and maintains proportional relations between loudness levels.
[Moore 2016] Moore BC, Sęk A. Preferred Compression Speed for Speech and Music and Its Relationship to Sensitivity to Temporal Fine Structure. Trends in Hearing. 2016 Sep 7.
[Plomp 1988] Plomp R. The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation transfer function. J Acoust Soc Am. 1988;83:2322-2327.
[Souza 2015] Souza P, Arehart K, Neher T. Working Memory and Hearing Aid Processing: Literature Findings, Future Directions, and Clinical Applications. Frontiers in Psychology. 2015;6:1894.