8 Interpretability, Explainability and Bias
Week 8
Lecture: Techniques for interpretability in text classification; sources of bias and their implications
Lab: Applying transformers-interpret and ferret for interpreting model predictions; visualising bias
8.1 Lecture
Content to be added.
8.2 Lab
Content to be added.
8.3 Readings
- Wan, Y. et al. (2023) ‘“Kelly is a Warm Person, Joseph is a Role Model”: Gender Biases in LLM-Generated Reference Letters’. arXiv. https://doi.org/10.48550/arXiv.2310.09219
- Rossi, L., Harrison, K. and Shklovski, I. (2024) ‘The Problems of LLM-generated Data in Social Science Research’, Sociologica, 18(2), pp. 145–168. https://doi.org/10.6092/issn.1971-8853/19576