Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech

Working Paper: NBER ID: w22423

Authors: Matthew Gentzkow; Jesse M. Shapiro; Matt Taddy

Abstract: We study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson’s party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century.
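
The finite-sample bias mentioned in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' estimator: it assumes partisanship is formalized as the expected posterior probability that a neutral observer assigns to the speaker's true party after hearing one phrase, and it computes that quantity by naively plugging in observed phrase frequencies. The function names, vocabulary size, and sample sizes are hypothetical. Both simulated parties draw phrases from the same uniform distribution, so true partisanship is exactly 0.5, yet the plug-in estimate lands well above 0.5 when the vocabulary is large relative to the amount of speech.

```python
import random
from collections import Counter


def plugin_partisanship(counts_r, counts_d):
    """Plug-in estimate of partisanship from raw phrase counts.

    Assumed definition (illustrative): the expected posterior probability
    that an observer with a 50/50 prior assigns to the speaker's true
    party after hearing a single phrase.
    """
    total_r = sum(counts_r.values())
    total_d = sum(counts_d.values())
    pi = 0.0
    for phrase in set(counts_r) | set(counts_d):
        q_r = counts_r.get(phrase, 0) / total_r  # observed R phrase share
        q_d = counts_d.get(phrase, 0) / total_d  # observed D phrase share
        rho = q_r / (q_r + q_d)  # posterior P(speaker is R | phrase)
        pi += 0.5 * q_r * rho + 0.5 * q_d * (1.0 - rho)
    return pi


def simulate(n_phrases_spoken, vocab_size=500, seed=0):
    """Both parties sample from the SAME uniform phrase distribution,
    so the true partisanship is 0.5 by construction."""
    rng = random.Random(seed)
    vocab = range(vocab_size)
    counts_r = Counter(rng.choices(vocab, k=n_phrases_spoken))
    counts_d = Counter(rng.choices(vocab, k=n_phrases_spoken))
    return plugin_partisanship(counts_r, counts_d)


# Hypothetical sample sizes: sparse speech versus abundant speech.
small_sample_estimate = simulate(200)
large_sample_estimate = simulate(200_000)

print("true partisanship:      0.5")
print("plug-in, 200 phrases:   ", round(small_sample_estimate, 3))
print("plug-in, 200000 phrases:", round(large_sample_estimate, 3))
```

In the sparse case most phrases are observed for only one party purely by chance, and the plug-in estimator treats each such phrase as perfectly revealing. The upward bias shrinks only as the amount of speech grows large relative to the vocabulary, which is the setting the abstract describes as problematic for standard approaches.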

Keywords: No keywords provided

JEL Codes: D72

Causal Claims Network Graph (figure not reproduced here): edges evidenced by causal inference methods are shown in orange; all other edges are shown in light blue.

Causal Claims

Cause	Effect
partisanship in congressional speech has increased significantly since the early 1990s (D72)	partisanship in congressional speech in the 2000s (D72)
sharp increase in partisanship (D72)	changes in political marketing strategies (M38)
sharp increase in partisanship (D72)	innovations in persuasive language (O35)
Republican takeover of Congress in 1994 (D72)	sharp increase in partisanship (D72)
earlier estimators (not correcting for finite-sample bias) (C51)	misleading results about partisanship levels (D72)
partisan differences in speech (D72)	reflect party differences in values and goals (D72)
partisan differences in speech (D72)	distinct from differences in roll call voting (D72)
