[P] Whisper Accent — Accent-Aware English Speech Recognition
Hi everyone, I’ve been working on Whisper-Accent, a project that investigates how to adapt Whisper for accented English speech while preserving strong transcription performance. The repository provides the full training setup, evaluation pipeline, and released checkpoints so that experiments can be reproduced, compared, and extended for research on accent-aware ASR.
Features:
- Extends Whisper with per-accent conditioning via Adaptive Layer Norm (AdaLN) in every decoder layer. The modulation weights are zero-initialized and trained, while the bias is initialized to the pretrained LayerNorm gamma and beta values and frozen, so at initialization each layer reproduces the pretrained LayerNorm.
- Accent embeddings are learned independently for each accent and used to condition the decoder hidden states.
- Accents are predicted from encoder hidden states via a classifier head:
  - Learnable weighted sum across all layers + input embeddings
  - Projection layer
  - Multi-head attention pooling over time
- The original encoder and decoder weights remain completely frozen, preserving the original generalization capability.
- Fewer than 10% of parameters are trainable (AdaLN modulation weights, accent embeddings, accent classifier).
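
To make the AdaLN bullet concrete, here is a minimal PyTorch sketch of one way to read it. It is not the repository's code: the class name `AdaLayerNorm`, the `accent_dim` size and the toy shapes are assumptions for illustration, and "weights" and "bias" are interpreted as the parameters of a linear map from the accent embedding to a per-channel scale and shift.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaLayerNorm(nn.Module):
    """LayerNorm whose scale and shift are predicted from an accent embedding (hypothetical sketch)."""

    def __init__(self, pretrained_ln: nn.LayerNorm, accent_dim: int):
        super().__init__()
        self.d_model = pretrained_ln.normalized_shape[0]
        self.eps = pretrained_ln.eps
        # Trainable modulation weights, zero-initialized.
        self.weight = nn.Parameter(torch.zeros(2 * self.d_model, accent_dim))
        # Frozen bias holding the pretrained gamma and beta (a buffer, so it gets no gradient).
        gamma_beta = torch.cat([pretrained_ln.weight.detach(), pretrained_ln.bias.detach()])
        self.register_buffer("bias", gamma_beta.clone())

    def forward(self, x: torch.Tensor, accent_emb: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model); accent_emb: (batch, accent_dim)
        scale, shift = F.linear(accent_emb, self.weight, self.bias).chunk(2, dim=-1)
        x = F.layer_norm(x, (self.d_model,), eps=self.eps)  # normalize without affine terms
        return x * scale.unsqueeze(1) + shift.unsqueeze(1)


torch.manual_seed(0)
ln = nn.LayerNorm(8)  # stands in for a pretrained decoder LayerNorm
with torch.no_grad():
    ln.weight.uniform_(0.5, 1.5)
    ln.bias.uniform_(-0.1, 0.1)

accent_table = nn.Embedding(23, 4)  # 23 accents; embedding size 4 is an arbitrary toy value
ada = AdaLayerNorm(ln, accent_dim=4)

x = torch.randn(2, 5, 8)
accent_ids = torch.tensor([0, 7])
out = ada(x, accent_table(accent_ids))

same_at_init = torch.allclose(out, ln(x), atol=1e-5)
trainable = [name for name, p in ada.named_parameters() if p.requires_grad]
print(same_at_init, trainable)
```

Because the modulation weights start at zero, the predicted scale and shift equal the pretrained gamma and beta for every accent, so the layer initially behaves like the original LayerNorm. Accent-specific behaviour only appears once training moves the weights away from zero.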
Supported accents:
- American, British, Scottish, Irish, Canadian, Northern Irish
- Indian, Spanish, Dutch, German, Czech, Polish
- French, Italian, Hungarian, Finnish
- Vietnamese, Romanian, Slovak, Estonian, Lithuanian, Croatian, Slovene
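
The classifier head from the feature list has to pick one of these 23 labels. The sketch below shows one plausible arrangement of its three stages in PyTorch. The class name `AccentClassifier`, the single learned pooling query, the softmax over layer weights and all dimensions are assumptions made for illustration, not details taken from the repository.

```python
import torch
import torch.nn as nn


class AccentClassifier(nn.Module):
    """Layer-weighted sum -> projection -> attention pooling over time -> accent logits (hypothetical sketch)."""

    def __init__(self, num_states: int, d_model: int, proj_dim: int, num_heads: int, num_accents: int):
        super().__init__()
        # One learnable mixing logit per hidden state (input embeddings + each encoder layer).
        self.layer_logits = nn.Parameter(torch.zeros(num_states))
        self.proj = nn.Linear(d_model, proj_dim)
        # A single learned query attends over all time steps (an assumed pooling design).
        self.query = nn.Parameter(torch.randn(1, 1, proj_dim) * 0.02)
        self.pool = nn.MultiheadAttention(proj_dim, num_heads, batch_first=True)
        self.out = nn.Linear(proj_dim, num_accents)

    def forward(self, hidden_states) -> torch.Tensor:
        # hidden_states: sequence of num_states tensors, each (batch, time, d_model)
        stacked = torch.stack(tuple(hidden_states), dim=0)            # (S, B, T, D)
        weights = torch.softmax(self.layer_logits, dim=0).view(-1, 1, 1, 1)
        mixed = (weights * stacked).sum(dim=0)                        # (B, T, D)
        h = self.proj(mixed)                                          # (B, T, P)
        q = self.query.expand(h.size(0), -1, -1)                      # (B, 1, P)
        pooled, _ = self.pool(q, h, h)                                # (B, 1, P)
        return self.out(pooled.squeeze(1))                            # (B, num_accents)


torch.manual_seed(0)
num_layers = 4  # toy value; a real encoder has more layers
fake_states = [torch.randn(2, 50, 16) for _ in range(num_layers + 1)]  # +1 for input embeddings
clf = AccentClassifier(num_states=num_layers + 1, d_model=16, proj_dim=32, num_heads=4, num_accents=23)
logits = clf(fake_states)
print(logits.shape)
```

With a Hugging Face Whisper encoder, the list of hidden states would typically come from calling the encoder with `output_hidden_states=True`; here random tensors stand in for them so the example runs on its own.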
Results:
Evaluation results on the test split of westbrook/English_Accent_DataSet.
| Model | Overall WER ↓ | Accent accuracy ↑ |
|---|---|---|
| Whisper Models: | | |
| openai/whisper-small.en | 17.6% | – |
| openai/whisper-medium.en | 17.5% | – |
| openai/whisper-large-v3 | 17.7% | – |
| openai/whisper-large-v3-turbo | 20.1% | – |
| Whisper Accent Models: | | |
| mavleo96/whisper-accent-small.en | 14.1% (-3.5 pts) | 85.1% |
| mavleo96/whisper-accent-medium.en | 13.4% (-4.1 pts) | 95.7% |
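
For readers less familiar with the metric, here is a small pure-Python sketch of corpus-level WER: total word-level edits divided by total reference words. The function names and the whitespace tokenization are assumptions. The evaluation pipeline's actual text normalization is not described in this post, so these numbers are only meant to show the arithmetic.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        curr = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references, hypotheses):
    """Total edits over total reference words, across all utterance pairs."""
    edits = 0
    ref_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.lower().split(), hyp.lower().split()
        edits += edit_distance(ref_words, hyp_words)
        ref_len += len(ref_words)
    if ref_len == 0:
        raise ValueError("references contain no words")
    return edits / ref_len


# 1 substitution + 1 deletion over 6 reference words
wer = corpus_wer(["the cat sat on the mat"], ["the cat sit on mat"])
print(f"{wer:.3f}")

# The parenthesized figures in the table match the absolute WER difference
# (in percentage points) between each accent model and its same-size baseline.
small_gain = round(17.6 - 14.1, 1)
medium_gain = round(17.5 - 13.4, 1)
print(small_gain, medium_gain)
```

In relative terms those same gaps are larger: roughly a fifth of the baseline's errors for the small model and a bit under a quarter for the medium one.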
Please share your thoughts and any suggestions on what else might be interesting to experiment with here, and feel free to star the repo if you find it interesting or helpful.
submitted by /u/Mavleo96