Whisper2

Speech and text LLMs revisited: under the hood investigation of multilingual models

9 Dec 2026 16h-20h (CET)
As a follow-up to our first Whisper workshop, this second informal workshop will discuss various approaches to probing neural audio models such as Whisper (Radford et al., 2022) or Wav2vec (Baevski et al., 2020). This is the Paris contribution to the Global Innovation Fund Research Award.

PROVISIONAL PROGRAMME 

16h Nicolas Ballier (UPCité), Gina Levow & Richard Wright (UW):  introduction

PART 1: METHODS AND TOOLS
16h05 Guillaume Wisniewski (UPCité) Measuring distance with wav2vec & Whisper
16h30 Younès Mattallaoui and Jalal Al-Tamimi (UPCité) Speech Dataset Construction and Fine-tuning of a Dialect- and Variation-Aware Arabic Speech Model
16h50  Discussion  (coffee)

PART 2: CASE STUDIES
17h15 Artem Saloev (UPCité) & Erin Pacquetet (SIAM) Gender Bias in Mimi Codebooks (provisional)
17h35 Quyen Pham (UPCité) Evaluating Whisper Models on Vietnamese Tonal Minimal Pairs
17h50 Behnoosh Namdarzadeh (UPCité)  The exploration of the Whisper token dictionary for Persian
18h05 Gita Dhungana (UW) A linguistics-based error analysis of Whisper’s output for Nepali data

18h15 Discussion  (coffee)
18h40 Siyu Liang (UW) Beyond WER: Probing Whisper’s Sub‐token Decoder Across Diverse Language Resource Levels
19h00 Vipasha Bansal (UW) Whisper and Bengali: Phonetic and orthographic errors in transcription
19h10 Florian Humbert and Alicia Wassink (both UW) A sociolinguistically-informed analysis of ASR errors in underrepresented Washington state dialects
19h30 C. M. Downey (Rochester) Multilingual Tokenizer Adaptation (Specializing representations for target languages)

19h50 Final discussion and closing remarks
20h end

Room: 720
Bâtiment Olympe de Gouges
8 Place Paul Ricoeur
75013 PARIS

 

ADDITIONAL RESOURCES:
Downey, C. M., Blevins, T., Goldfine, N., & Steinert-Threlkeld, S. (2023, December). Embedding Structure Matters: Comparing Methods to Adapt Multilingual Vocabularies to New Languages. In Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL) (pp. 268-281).
Siyu Liang, Nicolas Ballier, Gina-Anne Levow, Richard Wright. Beyond WER: Probing Whisper’s Sub‐token Decoder Across Diverse Language Resource Levels. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, ACL, Nov 2025, Suzhou, China. pp. 31225-31235, ⟨10.18653/v1/2025.emnlp-main.1591⟩
A link to the Whisper implementation that provides access to the Whisper dictionaries and prediction probabilities:
Ballier, N., Arnold, T., Méli, A., Thurston, T., & Yunès, J. B. (2024). Whisper for L2 speech scoring. International Journal of Speech Technology, 27(4), 923-934.