Text as Data

Working Paper: NBER ID: w23276

Authors: Matthew Gentzkow; Bryan T. Kelly; Matt Taddy

Abstract: An ever-increasing share of human interaction, communication, and culture is recorded as digital text. We provide an introduction to the use of text as an input to economic research. We discuss the features that make text different from other forms of data, offer a practical overview of relevant statistical methods, and survey a variety of applications.

Keywords: Text analysis; High-dimensional data; Economics; Statistical methods

JEL Codes: C1


Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.


Causal Claims

Cause | Effect
Text from financial news (G19) | Asset price movements (G19)
Text from macroeconomic reports (E60) | Inflation and unemployment rates (E31)
Google search data (Y10) | Voting behavior in the 2008 election (K16)
News text (Y60) | Political slant (D72)
Local news coverage of earnings announcements (G14) | Trading by local investors (G15)
