RT Journal Article
SR Electronic(1)
A1 Jaidka, Kokil
YR 2022
T1 Talking politics: Building and validating data-driven lexica to measure political discussion quality
JF Computational Communication Research
VO 4
IS 2
SP 486
OP 527
DO https://doi.org/10.5117/CCR2022.2.005.JAID
PB Amsterdam University Press
SN 2665-9085
AB Social media data offers computational social scientists the opportunity to understand how ordinary citizens engage in political activities, such as expressing their ideological stances and engaging in policy discussions. This study curates and develops discussion quality lexica from the Corpus for the Linguistic Analysis of Political Talk ONline (CLAPTON). Supervised machine learning classifiers to characterize political talk are evaluated for out-of-sample label prediction and generalizability to new contexts. The approach yields data-driven lexica, or dictionaries, that can be applied to measure the constructiveness, justification, relevance, reciprocity, empathy, and incivility of political discussions. In addition, the findings illustrate how the choices made in training such classifiers, such as the heterogeneity of the data, the feature sets used to train classifiers, and the classification approach, affect their generalizability. The article concludes by summarizing the strengths and weaknesses of applying machine learning methods to social media posts and theoretical insights into the quality and structure of online political discussions.
UL https://www.aup-online.com/content/journals/10.5117/CCR2022.2.005.JAID