Universal Speech Enhancement with Regression and Generative Mamba
Journal
Interspeech 2025
Series/Report No.
Proceedings of the Annual Conference of the International Speech Communication Association Interspeech
Start Page
888
End Page
892
ISSN
2308457X
Date Issued
2025-08-17
Author(s)
Abstract
The Interspeech 2025 URGENT Challenge aimed to advance universal, robust, and generalizable speech enhancement by unifying speech enhancement tasks across a wide variety of conditions, including seven different distortion types and five languages. We present Universal Speech Enhancement Mamba (USEMamba), a state-space speech enhancement model designed to handle long-range sequence modeling, time-frequency structured processing, and sampling frequency-independent feature extraction. Our approach primarily relies on regression-based modeling, which performs well across most distortions. However, for packet loss and bandwidth extension, where missing content must be inferred, a generative variant of the proposed USEMamba proves more effective. Despite being trained in only a subset of the full training data, USEMamba achieved 2nd place in Track 1 during the blind test phase, demonstrating strong generalization across diverse conditions.
Event(s)
26th Interspeech Conference 2025
Subjects
speech restoration
state-space models
universal speech enhancement
URGENT 2025 Challenge
Publisher
ISCA
Type
conference paper
