References

Adcock, Robert, and David Collier. 2001. “Measurement Validity: A Shared Standard for Qualitative and Quantitative Research.” The American Political Science Review 95 (3): 529–46. https://www.jstor.org/stable/3118231.

Argyle, Lisa P., Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. 2023. “Out of One, Many: Using Language Models to Simulate Human Samples.” Political Analysis 31 (3): 337–51. https://doi.org/10.1017/pan.2023.2.

Baker, Zachary R., and Zarif L. Azher. 2024. “Simulating The U.S. Senate: An LLM-Driven Agent Approach to Modeling Legislative Behavior and Bipartisanship.” arXiv. https://doi.org/10.48550/arXiv.2406.18702.

Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–23. FAccT ’21. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3442188.3445922.

Benoit, Kenneth. 2020. “Text as Data: An Overview.” In The SAGE Handbook of Research Methods in Political Science and International Relations, 461–97. London: SAGE.

Benoit, Kenneth, Kevin Munger, and Arthur Spirling. 2019. “Measuring and Explaining Political Sophistication Through Textual Complexity.” American Journal of Political Science 63 (2): 491–508. https://doi.org/10.1111/ajps.12423.

Birkenmaier, Lukas, Clemens M. Lechner, and Claudia Wagner. 2024. “The Search for Solid Ground in Text as Data: A Systematic Review of Validation Practices and Practical Recommendations for Validation.” Communication Methods and Measures 18 (3): 249–77. https://doi.org/10.1080/19312458.2023.2285765.

Bucher, Martin Juan José, and Marco Martini. 2024. “Fine-Tuned ‘Small’ LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification.” arXiv. https://doi.org/10.48550/arXiv.2406.08660.

Buolamwini, Joy, and Timnit Gebru. 2018. “Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.” In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, 81:77–91. PMLR. https://proceedings.mlr.press/v81/buolamwini18a.html.

Caravaca, Francisco, Ángel Cuevas, and Rubén Cuevas. 2025. “Dataset of Keywords Used by European Political Parties on Facebook.” Data in Brief 58 (February): 111280. https://doi.org/10.1016/j.dib.2025.111280.

Dietrich, Bryce J. 2021. “Using Motion Detection to Measure Social Polarization in the U.S. House of Representatives.” Political Analysis 29 (2): 250–59. https://doi.org/10.1017/pan.2020.25.

Dietrich, Bryce J., Ryan D. Enos, and Maya Sen. 2019. “Emotional Arousal Predicts Voting on the U.S. Supreme Court.” Political Analysis 27 (2): 237–43. https://doi.org/10.1017/pan.2018.47.

Dietrich, Bryce J., Matthew Hayes, and Diana Z. O’Brien. 2019. “Pitch Perfect: Vocal Pitch and the Emotional Intensity of Congressional Speech.” American Political Science Review 113 (4): 941–62. https://doi.org/10.1017/S0003055419000467.

Gilardi, Fabrizio, Meysam Alizadeh, and Maël Kubli. 2023. “ChatGPT Outperforms Crowd Workers for Text-Annotation Tasks.” Proceedings of the National Academy of Sciences 120 (30): e2305016120. https://doi.org/10.1073/pnas.2305016120.

Girbau, Andreu, Tetsuro Kobayashi, Benjamin Renoust, Yusuke Matsui, and Shin’ichi Satoh. 2024. “Face Detection, Tracking, and Classification from Large-Scale News Archives for Analysis of Key Political Figures.” Political Analysis 32 (2): 221–39. https://doi.org/10.1017/pan.2023.33.

Grimmer, Justin, and Brandon M. Stewart. 2013. “Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts.” Political Analysis 21 (3): 267–97. https://doi.org/10.1093/pan/mps028.

Grossmann, Igor, Matthew Feinberg, Dawn C. Parker, Nicholas A. Christakis, Philip E. Tetlock, and William A. Cunningham. 2023. “AI and the Transformation of Social Science Research.” Science 380 (6650): 1108–9. https://doi.org/10.1126/science.adi1778.

Hackenburg, Kobi, and Helen Margetts. 2024. “Evaluating the Persuasive Influence of Political Microtargeting with Large Language Models.” Proceedings of the National Academy of Sciences 121 (24): e2403116121. https://doi.org/10.1073/pnas.2403116121.

Halterman, Andrew, and Katherine A. Keith. 2025. “Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts.” arXiv. https://doi.org/10.48550/arXiv.2407.10747.

Hwang, Jackelyn, and Nikhil Naik. 2023. “Systematic Social Observation at Scale: Using Crowdsourcing and Computer Vision to Measure Visible Neighborhood Conditions.” Sociological Methodology 53 (2): 183–216. https://doi.org/10.1177/00811750231160781.

Jaros, Kyle, and Jennifer Pan. 2018. “China’s Newsmakers: Official Media Coverage and Political Shifts in the Xi Jinping Era.” The China Quarterly 233 (March): 111–36. https://doi.org/10.1017/S0305741017001679.

Joo, Jungseock, and Zachary C. Steinert-Threlkeld. 2018. “Image as Data: Automated Visual Content Analysis for Political Science.” arXiv. https://doi.org/10.48550/arXiv.1810.01544.

Laurer, Moritz, Wouter van Atteveldt, Andreu Casas, and Kasper Welbers. 2024. “Less Annotating, More Classifying: Addressing the Data Scarcity Issue of Supervised Machine Learning with Deep Transfer Learning and BERT-NLI.” Political Analysis 32 (1): 84–100. https://doi.org/10.1017/pan.2023.20.

Laver, Michael, Kenneth Benoit, and John Garry. 2003. “Extracting Policy Positions from Political Texts Using Words as Data.” American Political Science Review 97 (2): 311–31. https://doi.org/10.1017/S0003055403000698.

Liu, Yuhan, Zirui Song, Xiaoqing Zhang, Xiuying Chen, and Rui Yan. 2024. “From a Tiny Slip to a Giant Leap: An LLM-Based Simulation for Fake News Evolution.” arXiv. https://doi.org/10.48550/arXiv.2410.19064.

Lucy, Li, Dorottya Demszky, Patricia Bromley, and Dan Jurafsky. 2020. “Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks.” AERA Open 6 (3): 2332858420940312. https://doi.org/10.1177/2332858420940312.

Lüken, Malte, Kody Moodley, Eva Viviani, Christian Pipal, and Gijs Schumacher. 2024. “MEXCA - A Simple and Robust Pipeline for Capturing Emotion Expressions in Faces, Vocalization, and Speech.” OSF. https://doi.org/10.31234/osf.io/56svb.

Mendelsohn, Julia, Ceren Budak, and David Jurgens. 2021. “Modeling Framing in Immigration Discourse on Social Media.” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2219–63. https://doi.org/10.18653/v1/2021.naacl-main.179.

Mikhaylov, Slava, Michael Laver, and Kenneth R. Benoit. 2012. “Coder Reliability and Misclassification in the Human Coding of Party Manifestos.” Political Analysis 20 (1): 78–91. https://doi.org/10.1093/pan/mpr047.

Moghimifar, Farhad, Yuan-Fang Li, Robert Thomson, and Gholamreza Haffari. 2024. “Modelling Political Coalition Negotiations Using LLM-Based Agents.” arXiv. https://doi.org/10.48550/arXiv.2402.11712.

Müller, Stefan, and Sven-Oliver Proksch. 2024. “Nostalgia in European Party Politics: A Text-Based Measurement Approach.” British Journal of Political Science 54 (3): 993–1005. https://doi.org/10.1017/S0007123423000571.

Radford, Alec, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. 2022. “Robust Speech Recognition via Large-Scale Weak Supervision.” arXiv. https://doi.org/10.48550/arXiv.2212.04356.

Rask, Mathias. 2025. “When They Go High, We Go Low: Rhetorical Rewards of Governing.”

Rheault, Ludovic, Kaspar Beelen, Christopher Cochrane, and Graeme Hirst. 2016. “Measuring Emotion in Parliamentary Debates with Automated Textual Analysis.” Edited by Joseph Najbauer. PLOS ONE 11 (12): e0168843. https://doi.org/10.1371/journal.pone.0168843.

Rheault, Ludovic, and Christopher Cochrane. 2020. “Word Embeddings for the Analysis of Ideological Placement in Parliamentary Corpora.” Political Analysis 28 (1): 112–33. https://doi.org/10.1017/pan.2019.26.

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. 2016. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–44. San Francisco, California, USA: ACM. https://doi.org/10.1145/2939672.2939778.

Rittmann, Oliver. 2024. “A Measurement Framework for Computationally Analyzing Politicians’ Body Language.” OSF. https://doi.org/10.31219/osf.io/9wynp.

Rodriguez, Pedro L., and Arthur Spirling. 2022. “Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research.” The Journal of Politics 84 (1): 101–15. https://doi.org/10.1086/715162.

Rodriguez, Pedro L., Arthur Spirling, and Brandon M. Stewart. 2023. “Embedding Regression: Models for Context-Specific Description and Inference.” American Political Science Review 117 (4): 1255–74. https://doi.org/10.1017/S0003055422001228.

Rogers, Anna, Olga Kovaleva, and Anna Rumshisky. 2019. “Calls to Action on Social Media: Potential for Censorship and Social Impact.” EMNLP-IJCNLP 2019 Second Workshop on Natural Language Processing for Internet Freedom.

Russell, Stuart, and Peter Norvig. 2020. Artificial Intelligence: A Modern Approach. Pearson.

Santurkar, Shibani, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, and Tatsunori Hashimoto. 2023. “Whose Opinions Do Language Models Reflect?” In Proceedings of the 40th International Conference on Machine Learning, 202:29971–30004. PMLR. https://proceedings.mlr.press/v202/santurkar23a.html.

Slapin, Jonathan B., and Sven-Oliver Proksch. 2008. “A Scaling Model for Estimating Time-Series Party Positions from Texts.” American Journal of Political Science 52 (3): 705–22. https://doi.org/10.1111/j.1540-5907.2008.00338.x.

Smith, Marianne, Bryce Jensen Dietrich, Er-wei Bai, and Henry Jeremy Bockholt. 2020. “Vocal Pattern Detection of Depression Among Older Adults.” International Journal of Mental Health Nursing 29 (3): 440–49. https://doi.org/10.1111/inm.12678.

Tarr, Alexander, June Hwang, and Kosuke Imai. 2023. “Automated Coding of Political Campaign Advertisement Videos: An Empirical Validation Study.” Political Analysis 31 (4): 554–74. https://doi.org/10.1017/pan.2022.26.

Timm, Jasper, Chetan Talele, and Jacob Haimes. 2025. “Tailored Truths: Optimizing LLM Persuasion with Personalization and Fabricated Statistics.” arXiv. https://doi.org/10.48550/arXiv.2501.17273.

Törnberg, Petter. 2024. “Best Practices for Text Annotation with Large Language Models.” arXiv. https://doi.org/10.48550/arXiv.2402.05129.

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” Advances in Neural Information Processing Systems 30. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html.

Widmann, Tobias, and Maximilian Wich. 2023. “Creating and Comparing Dictionary, Word Embedding, and Transformer-Based Models to Measure Discrete Emotions in German Political Text.” Political Analysis 31 (4): 626–41. https://doi.org/10.1017/pan.2022.15.

Williams, Adrienne, Milagros Miceli, and Timnit Gebru. 2022. “The Exploited Labor Behind Artificial Intelligence.” Noema, October. https://www.noemamag.com/the-exploited-labor-behind-artificial-intelligence.

Xue, Zhaoqian, Mingyu Jin, Beichen Wang, Suiyuan Zhu, Kai Mei, Hua Tang, Wenyue Hua, Mengnan Du, and Yongfeng Zhang. 2025. “What If LLMs Have Different World Views: Simulating Alien Civilizations with LLM-Based Agents.” arXiv. https://doi.org/10.48550/arXiv.2402.13184.

Zech, John R., Marcus A. Badgeley, Manway Liu, Anthony B. Costa, Joseph J. Titano, and Eric Karl Oermann. 2018. “Variable Generalization Performance of a Deep Learning Model to Detect Pneumonia in Chest Radiographs: A Cross-Sectional Study.” PLOS Medicine 15 (11): e1002683. https://doi.org/10.1371/journal.pmed.1002683.

Zhao, Haiyan, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, and Mengnan Du. 2023. “Explainability for Large Language Models: A Survey.” arXiv. https://doi.org/10.48550/arXiv.2309.01029.

Ziems, Caleb, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, and Diyi Yang. 2023. “Can Large Language Models Transform Computational Social Science?” arXiv. https://doi.org/10.48550/arXiv.2305.03514.