References
Adcock, Robert, and David Collier. 2001. “Measurement
Validity: A Shared
Standard for Qualitative and
Quantitative Research.” The
American Political Science Review 95 (3): 529–46. https://www.jstor.org/stable/3118231.
Argyle, Lisa P., Ethan C. Busby, Nancy Fulda, Joshua R. Gubler,
Christopher Rytting, and David Wingate. 2023. “Out of
One, Many: Using
Language Models to Simulate
Human Samples.” Political
Analysis 31 (3): 337–51. https://doi.org/10.1017/pan.2023.2.
Baker, Zachary R., and Zarif L. Azher. 2024. “Simulating
The U.S. Senate:
An LLM-Driven Agent
Approach to Modeling Legislative
Behavior and Bipartisanship.” arXiv. https://doi.org/10.48550/arXiv.2406.18702.
Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret
Shmitchell. 2021. “On the Dangers of
Stochastic Parrots: Can
Language Models Be
Too Big? 🦜.” In Proceedings of the
2021 ACM Conference on Fairness,
Accountability, and Transparency, 610–23.
FAccT ’21. New York, NY, USA: Association for Computing
Machinery. https://doi.org/10.1145/3442188.3445922.
Benoit, Kenneth. 2020. “Text as Data: An
Overview.” In The SAGE
Handbook of Research Methods in
Political Science and
International Relations, 461–97. London:
Sage.
Benoit, Kenneth, Kevin Munger, and Arthur Spirling. 2019.
“Measuring and Explaining Political
Sophistication Through Textual
Complexity.” American Journal of Political
Science 63 (2): 491–508. https://doi.org/10.1111/ajps.12423.
Birkenmaier, Lukas, Clemens M. Lechner, and Claudia Wagner. 2024.
“The Search for Solid
Ground in Text as Data:
A Systematic Review of
Validation Practices and
Practical Recommendations for
Validation.” Communication Methods and
Measures 18 (3): 249–77. https://doi.org/10.1080/19312458.2023.2285765.
Bucher, Martin Juan José, and Marco Martini. 2024.
“Fine-Tuned ‘Small’ LLMs
(Still) Significantly Outperform
Zero-Shot Generative
AI Models in Text
Classification.” arXiv. https://doi.org/10.48550/arXiv.2406.08660.
Buolamwini, Joy, and Timnit Gebru. 2018. “Gender
Shades: Intersectional Accuracy
Disparities in Commercial Gender
Classification.” In Proceedings of the 1st
Conference on Fairness,
Accountability and Transparency,
81:77–91. PMLR. https://proceedings.mlr.press/v81/buolamwini18a.html.
Caravaca, Francisco, Ángel Cuevas, and Rubén Cuevas. 2025.
“Dataset of Keywords Used by European Political
Parties on Facebook.” Data in Brief 58
(February): 111280. https://doi.org/10.1016/j.dib.2025.111280.
Dietrich, Bryce J. 2021. “Using Motion
Detection to Measure Social
Polarization in the U.S.
House of Representatives.”
Political Analysis 29 (2): 250–59. https://doi.org/10.1017/pan.2020.25.
Dietrich, Bryce J., Ryan D. Enos, and Maya Sen. 2019. “Emotional
Arousal Predicts Voting on the
U.S. Supreme
Court.” Political Analysis 27 (2): 237–43.
https://doi.org/10.1017/pan.2018.47.
Dietrich, Bryce J., Matthew Hayes, and Diana Z. O’Brien. 2019.
“Pitch Perfect: Vocal Pitch
and the Emotional Intensity of
Congressional Speech.” American
Political Science Review 113 (4): 941–62. https://doi.org/10.1017/S0003055419000467.
Gilardi, Fabrizio, Meysam Alizadeh, and Maël Kubli. 2023.
“ChatGPT Outperforms Crowd Workers for
Text-Annotation Tasks.” Proceedings of the National Academy
of Sciences 120 (30): e2305016120. https://doi.org/10.1073/pnas.2305016120.
Girbau, Andreu, Tetsuro Kobayashi, Benjamin Renoust, Yusuke Matsui, and
Shin’ichi Satoh. 2024. “Face Detection,
Tracking, and Classification from
Large-Scale News
Archives for Analysis of Key
Political Figures.” Political
Analysis 32 (2): 221–39. https://doi.org/10.1017/pan.2023.33.
Grimmer, Justin, and Brandon M. Stewart. 2013. “Text as
Data: The Promise and
Pitfalls of Automatic Content
Analysis Methods for Political
Texts.” Political Analysis 21 (3): 267–97.
https://doi.org/10.1093/pan/mps028.
Grossmann, Igor, Matthew Feinberg, Dawn C. Parker, Nicholas A.
Christakis, Philip E. Tetlock, and William A. Cunningham. 2023.
“AI and the Transformation of Social Science
Research.” Science 380 (6650): 1108–9. https://doi.org/10.1126/science.adi1778.
Hackenburg, Kobi, and Helen Margetts. 2024. “Evaluating the
Persuasive Influence of Political Microtargeting with Large Language
Models.” Proceedings of the National Academy of Sciences
121 (24): e2403116121. https://doi.org/10.1073/pnas.2403116121.
Halterman, Andrew, and Katherine A. Keith. 2025. “Codebook
LLMs: Evaluating LLMs as
Measurement Tools for Political
Science Concepts.” arXiv. https://doi.org/10.48550/arXiv.2407.10747.
Hwang, Jackelyn, and Nikhil Naik. 2023. “Systematic
Social Observation at Scale:
Using Crowdsourcing and Computer
Vision to Measure Visible
Neighborhood Conditions.”
Sociological Methodology 53 (2): 183–216. https://doi.org/10.1177/00811750231160781.
Jaros, Kyle, and Jennifer Pan. 2018. “China’s
Newsmakers: Official Media
Coverage and Political Shifts in
the Xi Jinping Era.”
The China Quarterly 233 (March): 111–36. https://doi.org/10.1017/S0305741017001679.
Joo, Jungseock, and Zachary C. Steinert-Threlkeld. 2018. “Image as
Data: Automated Visual
Content Analysis for Political
Science.” arXiv. https://doi.org/10.48550/arXiv.1810.01544.
Laurer, Moritz, Wouter van Atteveldt, Andreu Casas, and Kasper Welbers.
2024. “Less Annotating, More
Classifying: Addressing the Data
Scarcity Issue of Supervised
Machine Learning with Deep
Transfer Learning and
BERT-NLI.” Political Analysis
32 (1): 84–100. https://doi.org/10.1017/pan.2023.20.
Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting
Policy Positions from Political
Texts Using Words as
Data.” American Political Science Review 97
(2). https://doi.org/10.1017/S0003055403000698.
Liu, Yuhan, Zirui Song, Xiaoqing Zhang, Xiuying Chen, and Rui Yan. 2024.
“From a Tiny Slip to a
Giant Leap: An
LLM-Based Simulation for
Fake News Evolution.”
arXiv. https://doi.org/10.48550/arXiv.2410.19064.
Lucy, Li, Dorottya Demszky, Patricia Bromley, and Dan Jurafsky. 2020.
“Content Analysis of Textbooks via
Natural Language Processing:
Findings on Gender, Race, and
Ethnicity in Texas
U.S. History
Textbooks.” AERA Open 6 (3):
2332858420940312. https://doi.org/10.1177/2332858420940312.
Lüken, Malte, Kody Moodley, Eva Viviani, Christian Pipal, and Gijs
Schumacher. 2024. “MEXCA - A
Simple and Robust Pipeline for
Capturing Emotion Expressions in
Faces, Vocalization, and
Speech.” OSF. https://doi.org/10.31234/osf.io/56svb.
Mendelsohn, Julia, Ceren Budak, and David Jurgens. 2021. “Modeling
Framing in Immigration Discourse
on Social Media.” In Proceedings of
the 2021 Conference of the North
American Chapter of the
Association for Computational
Linguistics: Human Language
Technologies, 2219–63. https://doi.org/10.18653/v1/2021.naacl-main.179.
Mikhaylov, Slava, Michael Laver, and Kenneth R. Benoit. 2012.
“Coder Reliability and Misclassification
in the Human Coding of Party
Manifestos.” Political Analysis 20 (1):
78–91. https://doi.org/10.1093/pan/mpr047.
Moghimifar, Farhad, Yuan-Fang Li, Robert Thomson, and Gholamreza
Haffari. 2024. “Modelling Political
Coalition Negotiations Using
LLM-Based Agents.” arXiv. https://doi.org/10.48550/arXiv.2402.11712.
Müller, Stefan, and Sven-Oliver Proksch. 2024. “Nostalgia in
European Party Politics:
A Text-Based
Measurement Approach.” British
Journal of Political Science 54 (3): 993–1005. https://doi.org/10.1017/S0007123423000571.
Radford, Alec, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey,
and Ilya Sutskever. 2022. “Robust Speech
Recognition via Large-Scale
Weak Supervision.” arXiv.
https://arxiv.org/abs/2212.04356v1.
Rask, Mathias. 2025. “When They Go
High, We Go Low:
Rhetorical Rewards of
Governing.”
Rheault, Ludovic, Kaspar Beelen, Christopher Cochrane, and Graeme Hirst.
2016. “Measuring Emotion in
Parliamentary Debates with
Automated Textual
Analysis.” Edited by Joseph Najbauer. PLOS
ONE 11 (12): e0168843. https://doi.org/10.1371/journal.pone.0168843.
Rheault, Ludovic, and Christopher Cochrane. 2020. “Word
Embeddings for the Analysis of
Ideological Placement in
Parliamentary Corpora.” Political
Analysis 28 (1): 112–33. https://doi.org/10.1017/pan.2019.26.
Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016.
“‘Why Should I
Trust You?’: Explaining the
Predictions of Any
Classifier.” In Proceedings of the 22nd
ACM SIGKDD International
Conference on Knowledge Discovery
and Data Mining, 1135–44. San Francisco,
California, USA: ACM. https://doi.org/10.1145/2939672.2939778.
Rittmann, Oliver. 2024. “A Measurement
Framework for Computationally
Analyzing Politicians’ Body
Language.” OSF. https://doi.org/10.31219/osf.io/9wynp.
Rodriguez, Pedro L., and Arthur Spirling. 2022. “Word
Embeddings: What Works,
What Doesn’t, and How to
Tell the Difference for Applied
Research.” The Journal of Politics 84 (1):
101–15. https://doi.org/10.1086/715162.
Rodriguez, Pedro L., Arthur Spirling, and Brandon M. Stewart. 2023.
“Embedding Regression: Models for
Context-Specific Description and
Inference.” American Political Science
Review 117 (4): 1255–74. https://doi.org/10.1017/S0003055422001228.
Rogers, Anna, Olga Kovaleva, and Anna Rumshisky. 2019. “Calls to
Action on Social Media:
Potential for Censorship and
Social Impact.” EMNLP-IJCNLP 2019
Second Workshop on Natural Language Processing for Internet
Freedom.
Russell, Stuart, and Peter Norvig. 2020. Artificial
Intelligence: A Modern
Approach. Pearson.
Santurkar, Shibani, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang,
and Tatsunori Hashimoto. 2023. “Whose Opinions
Do Language Models
Reflect?” In Proceedings of the 40th
International Conference on
Machine Learning, 202:29971–30004. PMLR.
https://proceedings.mlr.press/v202/santurkar23a.html.
Slapin, Jonathan B., and Sven-Oliver Proksch. 2008. “A
Scaling Model for Estimating
Time-Series Party
Positions from Texts.” American
Journal of Political Science 52 (3): 705–22. https://doi.org/10.1111/j.1540-5907.2008.00338.x.
Smith, Marianne, Bryce Jensen Dietrich, Er-wei Bai, and Henry Jeremy
Bockholt. 2020. “Vocal Pattern Detection of Depression Among Older
Adults.” International Journal of Mental Health Nursing
29 (3): 440–49. https://doi.org/10.1111/inm.12678.
Tarr, Alexander, June Hwang, and Kosuke Imai. 2023. “Automated
Coding of Political Campaign
Advertisement Videos: An
Empirical Validation
Study.” Political Analysis 31 (4): 554–74.
https://doi.org/10.1017/pan.2022.26.
Timm, Jasper, Chetan Talele, and Jacob Haimes. 2025. “Tailored
Truths: Optimizing LLM
Persuasion with Personalization and
Fabricated Statistics.” arXiv. https://doi.org/10.48550/arXiv.2501.17273.
Törnberg, Petter. 2024. “Best Practices for
Text Annotation with Large
Language Models.” arXiv. https://doi.org/10.48550/arXiv.2402.05129.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion
Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017.
“Attention Is All You Need.” Advances in Neural
Information Processing Systems 30. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.
Widmann, Tobias, and Maximilian Wich. 2023. “Creating and
Comparing Dictionary, Word
Embedding, and Transformer-Based
Models to Measure Discrete
Emotions in German Political
Text.” Political Analysis 31 (4): 626–41.
https://doi.org/10.1017/pan.2022.15.
Williams, Adrienne, Milagros Miceli, and Timnit Gebru. 2022. “The
Exploited Labor Behind
Artificial Intelligence.”
Noema, October. https://www.noemamag.com/the-exploited-labor-behind-artificial-intelligence.
Xue, Zhaoqian, Mingyu Jin, Beichen Wang, Suiyuan Zhu, Kai Mei, Hua Tang,
Wenyue Hua, Mengnan Du, and Yongfeng Zhang. 2025. “What If
LLMs Have Different
World Views: Simulating
Alien Civilizations with
LLM-Based Agents.” arXiv. https://doi.org/10.48550/arXiv.2402.13184.
Zech, John R., Marcus A. Badgeley, Manway Liu, Anthony B. Costa, Joseph
J. Titano, and Eric Karl Oermann. 2018. “Variable Generalization
Performance of a Deep Learning Model to Detect Pneumonia in Chest
Radiographs: A Cross-Sectional Study.” PLOS
Medicine 15 (11): e1002683. https://doi.org/10.1371/journal.pmed.1002683.
Zhao, Haiyan, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi
Cai, Shuaiqiang Wang, Dawei Yin, and Mengnan Du. 2023.
“Explainability for Large Language
Models: A Survey.” arXiv.
https://doi.org/10.48550/arXiv.2309.01029.
Ziems, Caleb, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, and
Diyi Yang. 2023. “Can Large Language
Models Transform Computational
Social Science?” arXiv. http://arxiv.org/abs/2305.03514.