Audio Super-Resolution
Audio super-resolution is the task of reconstructing a high-resolution audio signal from its low-resolution counterpart. Applied to speech, it improves the quality of a signal by increasing its sampling rate or bandwidth while preserving naturalness and intelligibility. A representative GitHub project for speech super-resolution is ClearerVoice-Studio.
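To make the distinction between sampling rate and bandwidth concrete, here is a minimal NumPy sketch of a naive baseline. It is an illustration only, not the method of any listed model or of ClearerVoice-Studio. The function name `upsample_fft` and the 8 kHz test signal are assumptions chosen for the example. Zero-padding the spectrum raises the sampling rate, but the new upper band stays empty, and that missing content is what a super-resolution model has to predict.

```python
import numpy as np


def upsample_fft(x: np.ndarray, factor: int) -> np.ndarray:
    """Naive upsampling by an integer factor via spectral zero-padding.

    Hypothetical helper for illustration: it increases the sampling rate
    but cannot restore frequencies above the original Nyquist limit.
    """
    if factor < 2:
        raise ValueError("factor must be an integer >= 2")
    n = len(x)
    spectrum = np.fft.rfft(x)
    if n % 2 == 0:
        # The old Nyquist bin becomes an ordinary bin in the longer
        # spectrum, so halve it to keep the interpolation exact.
        spectrum[-1] *= 0.5
    # irfft zero-pads the missing high-frequency bins when n is larger.
    # Multiplying by `factor` compensates for the 1/N normalisation.
    return np.fft.irfft(spectrum, n=n * factor) * factor


# Assumed test signal: one second of two tones sampled at 8 kHz.
sr_low = 8000
t = np.arange(sr_low) / sr_low
x_low = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

# Upsample to 16 kHz. The signal now has twice as many samples.
x_up = upsample_fft(x_low, factor=2)

# Measure how much energy lies above the original 4 kHz Nyquist limit.
power = np.abs(np.fft.rfft(x_up)) ** 2
freqs = np.fft.rfftfreq(len(x_up), d=1.0 / (2 * sr_low))
high_band_ratio = power[freqs > sr_low / 2].sum() / power.sum()
print(f"samples: {len(x_low)} -> {len(x_up)}")
print(f"energy share above {sr_low // 2} Hz: {high_band_ratio:.2e}")
```

The upsampled signal passes through the original samples, yet the band between 4 kHz and 8 kHz carries essentially no energy. Learned approaches differ from this baseline in that they try to synthesize plausible content for that band.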
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | U-Net | Log-Spectral Distance | 3.1 | — | Unverified |
| 2 | U-Net + TFiLM | Log-Spectral Distance | 1.8 | — | Unverified |
| 3 | U-Net + AFiLM | Log-Spectral Distance | 1.7 | — | Unverified |
| 4 | TUNet | Log-Spectral Distance | 1.36 | — | Unverified |
| 5 | TUNet + MSM pre-training | Log-Spectral Distance | 1.28 | — | Unverified |
| 6 | NVSR | Log-Spectral Distance | 0.78 | — | Unverified |
| 7 | CMGAN | Log-Spectral Distance | 0.76 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | U-Net | Log-Spectral Distance | 3.4 | — | Unverified |
| 2 | U-Net + TFiLM | Log-Spectral Distance | 2 | — | Unverified |
| 3 | U-Net + AFiLM | Log-Spectral Distance | 1.5 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | U-Net | Log-Spectral Distance | 3.2 | — | Unverified |
| 2 | U-Net + TFiLM | Log-Spectral Distance | 2.5 | — | Unverified |
| 3 | U-Net + AFiLM | Log-Spectral Distance | 2.3 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | U-Net and ResNet | SNR | 35.26 | — | Unverified |
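
The two metrics reported in the tables can be sketched in a few lines of NumPy. This is one common formulation, given here as an assumption: papers differ in frame length, hop size, window, and logarithm base, so values from this sketch are not guaranteed to be comparable with the claimed numbers above. The function names and the default `frame_len`, `hop`, and `eps` values are illustrative choices. Lower is better for Log-Spectral Distance, and higher is better for SNR.

```python
import numpy as np


def stft_power(x: np.ndarray, frame_len: int = 2048, hop: int = 512) -> np.ndarray:
    """Power spectrogram from Hann-windowed frames (frames x frequency bins).

    Assumes len(x) >= frame_len; trailing samples that do not fill a
    whole frame are dropped.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def log_spectral_distance(ref: np.ndarray, est: np.ndarray, eps: float = 1e-10) -> float:
    """Mean over frames of the RMS difference between log10 power spectra."""
    ref_log = np.log10(stft_power(ref) + eps)
    est_log = np.log10(stft_power(est) + eps)
    per_frame = np.sqrt(np.mean((ref_log - est_log) ** 2, axis=1))
    return float(np.mean(per_frame))


def snr_db(ref: np.ndarray, est: np.ndarray) -> float:
    """Signal-to-noise ratio in dB, treating (ref - est) as the noise."""
    noise = ref - est
    return float(10 * np.log10(np.sum(ref**2) / np.sum(noise**2)))


# Assumed toy data: white noise as the reference, scaled copies as estimates.
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)

lsd_identical = log_spectral_distance(ref, ref)
lsd_scaled = log_spectral_distance(ref, 0.5 * ref)
snr_scaled = snr_db(ref, 1.1 * ref)
print(lsd_identical, lsd_scaled, snr_scaled)
```

Both functions expect time-aligned signals of equal length at the same sampling rate, so a low-resolution input must be upsampled to the reference rate before it is scored.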