Whisper2

Speech and text LLMs revisited: under the hood investigation of multilingual models

9 Dec 2026 16h-20h (CET)
As a follow-up to our first Whisper workshop, this second informal workshop will discuss various approaches to probing neural audio models such as Whisper (Radford et al., 2022) or wav2vec 2.0 (Baevski et al., 2020). This is the Paris contribution to the Global Innovation Fund Research Award.

PROVISIONAL PROGRAMME 

16h Nicolas Ballier (UPCité), Gina Levow & Richard Wright (UW): Introduction

PART 1: METHODS AND TOOLS
16h05 Guillaume Wisniewski (UPCité) Measuring distance with wav2vec & Whisper
16h30 Younès Mattallaoui and Jalal Al Tamani (UPCité) Speech Dataset Construction and Fine-tuning of a Dialect- and Variation-Aware Arabic Speech Model
16h50  Discussion  (coffee)

PART 2: CASE STUDIES
17h15 Artem Saloev (UPCité) & Erin Pacquetet (SIAM) Gender Bias in Mimi Codebooks (provisional)
17h35 Quyen PHAM (UPCité) Whisper and Vietnamese tones
17h50 Behnoosh Namdarzadeh (UPCité)  The exploration of the Whisper token dictionary for Persian
18h05 Gita Dhungana (UW) Nepali

18h15 Discussion  (coffee)
18h40 Siyu Liang (UW) Beyond WER: Probing Whisper’s Sub‐token Decoder Across Diverse Language Resource Levels
19h00 Vipasha Bansal (UW) Bengali
19h10 Florian Humbert and Alicia Wassink (both UW) A sociolinguistically-informed analysis of ASR errors in underrepresented Washington state dialects
19h30 C. M. Downey (Rochester)
19h50 Final discussion and closing remarks
20h end
Zoom: https://u-paris.zoom.us/j/83867613208?pwd=NAlJ9Uil3fztw9tbxrjwiJGFrAg265.1
Password: 289481

Room: 720
Bâtiment Olympe de Gouges
8 Place Paul Ricoeur
75013 PARIS

 

ADDITIONAL RESOURCES:
Downey, C. M., Blevins, T., Goldfine, N., & Steinert-Threlkeld, S. (2023, December). Embedding Structure Matters: Comparing Methods to Adapt Multilingual Vocabularies to New Languages. In Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL) (pp. 268-281).
Siyu Liang, Nicolas Ballier, Gina-Anne Levow, Richard Wright. Beyond WER: Probing Whisper’s Sub‐token Decoder Across Diverse Language Resource Levels. Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, ACL, Nov 2025, Suzhou, China. pp. 31225-31235, ⟨10.18653/v1/2025.emnlp-main.1591⟩

 
