Interpretability Techniques for Speech Models#

Pre-trained foundation models have revolutionized speech technology, as they have many adjacent fields. Their combination of capability and opacity has spurred efforts to interpret how these models work. While interpretability research in computer vision and natural language processing has made significant progress toward understanding model internals and explaining model decisions, speech technology has lagged behind, despite its widespread reliance on complex, black-box neural models. Recent studies have begun to address this gap, with a growing body of literature on interpretability in the speech domain. This tutorial provides a structured overview of interpretability techniques, their applications, implications, and limitations when applied to speech models, aiming to help researchers and practitioners better understand, evaluate, debug, and optimize speech models while building trust in their predictions. In hands-on sessions, participants will explore how speech models encode distinct features (e.g., linguistic information) and use them during inference. By the end, attendees will be equipped with the tools and knowledge to start analyzing and interpreting speech models in their own research, potentially inspiring new directions.

Note

We presented our tutorial on Interpretability Techniques for Speech Models on Sunday, August 17th at this year’s Interspeech conference in Rotterdam.
Check out the programme below, and browse the materials through the sidebar menu (an overview of all slides and notebooks is here).

(Figure: tutorial overview diagram)

Programme at Interspeech 2025#

| Time | Topic | Presenter |
|---|---|---|
| 15:30 | Welcome | Tom Lentz |
| 15:30 - 15:50 | Introduction to the challenges of speech data for interpretability research | Grzegorz Chrupała |
| **Part I: Representation Understanding** | | |
| 15:50 - 16:15 | Lecture on representational analysis techniques for analyzing speech model internals: dimensionality reduction; diagnostic probes & lenses; representation space comparisons | Martijn Bentum |
| 16:15 - 16:30 | Walkthrough on Probing | Charlotte Pouw |
| 16:30 - 16:40 | Walkthrough on Representation space comparisons (CKA) | Marianne de Heer Kloots |
| 16:40 - 17:05 | Break | |
| **Part II: Feature Importance Scoring** | | |
| 17:05 - 17:30 | Lecture on Context Mixing: analyzing the pattern of attention in speech Transformers; limitations of interpreting raw attention as a measure of context mixing; expanding the scope of context-mixing analysis beyond attention | Hosein Mohebbi |
| 17:30 - 17:45 | Lecture on Feature Attribution: gradient-based methods; perturbation-based methods | Gaofei Shen |
| 17:45 - 18:00 | Walkthroughs on Context Mixing & Feature Attribution | Gaofei Shen & Hosein Mohebbi |
| **Discussion** | | |
| 18:00 - 18:10 | Key takeaways and outlook on future work in interpretability | Marianne de Heer Kloots |
| 18:10 - 18:30 | Panel discussion with all organizers: Tom Lentz, Grzegorz Chrupała, Martijn Bentum, Charlotte Pouw, Marianne de Heer Kloots, Hosein Mohebbi, Gaofei Shen | Willem Zuidema |

Tutorial contents#

Representational Analysis methods:

  • Probing [1,2,3,4,5,6]

  • Representation space comparisons: RSA [7,8], CCA [9], CKA [10]

  • CTC & Decoder lenses [11,12]

  • Embedding similarities (ABX tests) [13,14,15]
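Representation space comparisons such as linear CKA [10] take only a few lines to implement. Below is a minimal sketch in NumPy; the activation matrices are random stand-ins for real layer outputs (rows are frames, columns are hidden dimensions), and the variable names are illustrative rather than taken from the tutorial notebooks.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation
    matrices of shape (n_samples, dim_X) and (n_samples, dim_Y)."""
    # Center each feature dimension (column) across samples
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den

# Illustrative use: compare two (fake) layers of a speech encoder
rng = np.random.default_rng(0)
layer_a = rng.standard_normal((200, 64))            # e.g. 200 frames, 64 dims
layer_b = layer_a @ rng.standard_normal((64, 64))   # a linear map of layer_a
print(linear_cka(layer_a, layer_a))                 # identical layers -> 1.0
print(linear_cka(layer_a, layer_b))
```

CKA is invariant to orthogonal transformations and isotropic scaling of either representation, which is why it is often preferred over raw correlation for comparing layers of different widths.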

Feature Importance Scoring methods:

  • Context-mixing: Attention [16,17,18,19], Attention Norm [20], Value Zeroing [21]

  • Feature attribution [22,23]: Gradient-based [24,25] & Perturbation-based [26,27]
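The core idea behind perturbation-based attribution [26,27] can be sketched in a few lines: mask part of the input, re-run the model, and record how much the output score drops. The snippet below uses a toy scoring function in place of a real speech classifier; the function names and the frame-level masking scheme are illustrative assumptions, not the tutorial's actual implementation.

```python
import numpy as np

def occlusion_attribution(score_fn, x, window=4):
    """Perturbation-based attribution: zero out successive windows of
    input frames and record the drop in the model's score."""
    base = score_fn(x)
    scores = np.zeros(len(x))
    for start in range(0, len(x), window):
        x_masked = x.copy()
        x_masked[start:start + window] = 0.0   # silence this window
        scores[start:start + window] = base - score_fn(x_masked)
    return scores

# Toy stand-in for a speech classifier: its score depends only on the
# energy in frames 10-20, so those frames should get high attribution.
def toy_score(x):
    return float(np.abs(x[10:20]).sum())

x = np.ones(40)
attr = occlusion_attribution(toy_score, x, window=4)
print(attr.round(1))   # nonzero only around frames 10-20
```

Real perturbation methods for speech mask spectrogram regions or word-level audio segments rather than raw frames, but the loop-over-masks structure is the same; gradient-based methods [24,25] replace the loop with a backward pass through the model.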

References#

[1]

Patrick Cormac English, John D. Kelleher, and Julie Carson-Berndsen. Domain-Informed Probing of wav2vec 2.0 Embeddings for Phonetic Features. In Garrett Nicolai and Eleanor Chodroff, editors, Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 83–91. Seattle, Washington, July 2022. Association for Computational Linguistics. doi:10.18653/v1/2022.sigmorphon-1.9.

[2]

Cheol Jun Cho, Peter Wu, Abdelrahman Mohamed, and Gopala K. Anumanchipalli. Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1–5. June 2023. ISSN: 2379-190X. URL: https://ieeexplore.ieee.org/abstract/document/10094711, doi:10.1109/ICASSP49357.2023.10094711.

[3]

Martijn Bentum, Louis ten Bosch, and Tom Lentz. The Processing of Stress in End-to-End Automatic Speech Recognition Models. In Interspeech 2024, 2350–2354. 2024. doi:10.21437/Interspeech.2024-44.

[4]

Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, and Grzegorz Chrupała. Encoding of lexical tone in self-supervised models of spoken language. In Kevin Duh, Helena Gomez, and Steven Bethard, editors, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 4250–4261. Mexico City, Mexico, June 2024. Association for Computational Linguistics. doi:10.18653/v1/2024.naacl-long.239.

[5]

Martijn Bentum, Louis ten Bosch, and Tomas O. Lentz. Word stress in self-supervised speech models: A cross-linguistic comparison. In Interspeech 2025, 251–255. 2025. doi:10.48550/arXiv.2507.04738.

[6]

Marianne de Heer Kloots, Hosein Mohebbi, Charlotte Pouw, Gaofei Shen, Willem Zuidema, and Martijn Bentum. What do self-supervised speech models know about Dutch? Analyzing advantages of language-specific pre-training. In Interspeech 2025, 256–260. 2025. doi:10.48550/arXiv.2506.00981.

[7]

Grzegorz Chrupała, Bertrand Higy, and Afra Alishahi. Analyzing analytical methods: the case of phonology in neural models of spoken language. In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault, editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4146–4156. Online, July 2020. Association for Computational Linguistics. doi:10.18653/v1/2020.acl-main.381.

[8]

Gaofei Shen, Afra Alishahi, Arianna Bisazza, and Grzegorz Chrupała. Wave to Syntax: Probing spoken language models for syntax. In Interspeech 2023, 1259–1263. 2023. doi:10.21437/Interspeech.2023-679.

[9]

Ankita Pasad, Ju-Chieh Chou, and Karen Livescu. Layer-wise analysis of a self-supervised speech representation model. In 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 914–921. 2021. doi:10.1109/ASRU51503.2021.9688093.

[10]

Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, 3519–3529. PMLR, 09–15 Jun 2019. URL: https://proceedings.mlr.press/v97/kornblith19a.html.

[11]

Marianne de Heer Kloots and Willem Zuidema. Human-like Linguistic Biases in Neural Speech Models: Phonetic Categorization and Phonotactic Constraints in Wav2Vec2.0. In Interspeech 2024, 4593–4597. 2024. doi:10.21437/Interspeech.2024-2490.

[12]

Anna Langedijk, Hosein Mohebbi, Gabriele Sarti, Willem Zuidema, and Jaap Jumelet. DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers. In Kevin Duh, Helena Gomez, and Steven Bethard, editors, Findings of the Association for Computational Linguistics: NAACL 2024, 4764–4780. Mexico City, Mexico, June 2024. Association for Computational Linguistics. doi:10.18653/v1/2024.findings-naacl.296.

[13]

Thomas Schatz. ABX-Discriminability Measures and Applications. Theses, Université Paris 6 (UPMC), September 2016. URL: https://hal.science/tel-01407461.

[14]

Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurençon, Salah Zaiem, Abdelrahman Mohamed, Benoît Sagot, and Emmanuel Dupoux. DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon. Transactions of the Association for Computational Linguistics, 10:1051–1065, September 2022. doi:10.1162/tacl_a_00505.

[15]

Maureen de Seyssel, Jie Chi, Skyler Seto, Maartje ter Hoeve, Masha Fedzechkina, and Natalie Schluter. Discriminating Form and Meaning in Multilingual Models with Minimal-Pair ABX Tasks. June 2025. arXiv:2505.17747 [cs]. doi:10.48550/arXiv.2505.17747.

[16]

Shu-wen Yang, Andy T. Liu, and Hung-yi Lee. Understanding Self-Attention of Self-Supervised Audio Transformers. August 2020. URL: http://arxiv.org/abs/2006.03265, doi:10.48550/arXiv.2006.03265.

[17]

Kyuhong Shim, Jungwook Choi, and Wonyong Sung. Understanding the role of self-attention for speech understanding. In International Conference on Learning Representations. 2022. URL: https://openreview.net/forum?id=AvcfxqRy4Y.

[18]

Belen Alastruey, Javier Ferrando, Gerard I. Gállego, and Marta R. Costa-jussà. On the Locality of Attention in Direct Speech Translation. In Samuel Louvan, Andrea Madotto, and Brielen Madureira, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 402–412. Dublin, Ireland, 2022. Association for Computational Linguistics. URL: https://aclanthology.org/2022.acl-srw.32/, doi:10.18653/v1/2022.acl-srw.32.

[19]

Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran, and Pedro J. Moreno. Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition. 2022. URL: http://arxiv.org/abs/2209.06096, doi:10.48550/arXiv.2209.06096.

[20]

Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, and Kentaro Inui. Attention is Not Only a Weight: Analyzing Transformers with Vector Norms. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 7057–7075. Online, 2020. Association for Computational Linguistics. URL: https://aclanthology.org/2020.emnlp-main.574/, doi:10.18653/v1/2020.emnlp-main.574.

[21]

Hosein Mohebbi, Grzegorz Chrupała, Willem Zuidema, and Afra Alishahi. Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers. In Houda Bouamor, Juan Pino, and Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 8249–8260. Singapore, December 2023. Association for Computational Linguistics. doi:10.18653/v1/2023.emnlp-main.513.

[22]

Dennis Fucci, Beatrice Savoldi, Marco Gaido, Matteo Negri, Mauro Cettolo, and Luisa Bentivogli. Explainability for Speech Models: On the Challenges of Acoustic Feature Selection. In Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, and Rachele Sprugnoli, editors, Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), 373–381. Pisa, Italy, December 2024. CEUR Workshop Proceedings.

[23]

Gaofei Shen, Hosein Mohebbi, Arianna Bisazza, Afra Alishahi, and Grzegorz Chrupała. On the reliability of feature attribution methods for speech classification. In Interspeech 2025. 2025. doi:10.48550/arXiv.2505.16406.

[24]

Archiki Prasad and Preethi Jyothi. How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems. In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault, editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 3739–3753. Online, July 2020. Association for Computational Linguistics. URL: https://aclanthology.org/2020.acl-main.345, doi:10.18653/v1/2020.acl-main.345.

[25]

Shubham Gupta, Mirco Ravanelli, Pascal Germain, and Cem Subakan. Phoneme Discretized Saliency Maps for Explainable Detection of AI-Generated Voice. In Interspeech 2024, 3295–3299. 2024. doi:10.21437/Interspeech.2024-632.

[26]

Xiaoliang Wu, Peter Bell, and Ajitha Rajan. Explanations for automatic speech recognition. In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1–5. 2023. doi:10.1109/ICASSP49357.2023.10094635.

[27]

Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, and Elena Baralis. Explaining speech classification models via word-level audio segments and paralinguistic features. In Yvette Graham and Matthew Purver, editors, Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), 2221–2238. St. Julian's, Malta, March 2024. Association for Computational Linguistics. URL: https://aclanthology.org/2024.eacl-long.136/.