Partially Linear Models Under Data Combination

Working Paper: NBER ID: w29953

Authors: Xavier D'Haultfœuille; Christophe Gaillac; Arnaud Maurel

Abstract: We consider the identification of and inference on a partially linear model, when the outcome of interest and some of the covariates are observed in two different datasets that cannot be linked. This type of data combination problem arises very frequently in empirical microeconomics. Using recent tools from optimal transport theory, we derive a constructive characterization of the sharp identified set. We then build on this result and develop a novel inference method that exploits the specific geometric properties of the identified set. Our method performs well in finite samples while remaining very tractable. Finally, we apply our methodology to study intergenerational income mobility over the period 1850-1930 in the United States. Our method allows us to relax the exclusion restrictions used in earlier work while delivering confidence regions that are informative.
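
As a rough illustration of why optimal transport arises in data combination, consider a much simpler setting than the one in the paper: a single regressor x observed in one sample and the outcome y observed in another, with no way to link the two. The joint distribution is then unknown, but the comonotone and countermonotone couplings of the two marginals give the largest and smallest possible values of E[xy], and hence bounds on the slope Cov(x, y) / Var(x). The sketch below computes these bounds from empirical quantiles. It is not the authors' estimator or inference method; the function name `slope_bounds` and the parameter `n_grid` are hypothetical and chosen for illustration only.

```python
import numpy as np


def slope_bounds(y, x, n_grid=1000):
    """Bounds on Cov(X, Y) / Var(X) when y and x come from unlinked samples.

    Illustrative sketch only (scalar regressor, no other covariates).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)

    # Evaluate both empirical quantile functions on a common symmetric grid,
    # so the two samples may have different sizes.
    u = (np.arange(n_grid) + 0.5) / n_grid
    qy = np.quantile(y, u)
    qx = np.quantile(x, u)

    # Comonotone coupling (sorted with sorted) maximizes E[XY];
    # countermonotone coupling (sorted with reverse-sorted) minimizes it.
    exy_max = np.mean(qx * qy)
    exy_min = np.mean(qx * qy[::-1])

    mean_prod = qx.mean() * qy.mean()
    var_x = qx.var()
    return (exy_min - mean_prod) / var_x, (exy_max - mean_prod) / var_x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_sample = rng.normal(size=500)            # dataset containing only x
    y_sample = 2.0 * rng.normal(size=800)      # separate dataset containing only y
    lower, upper = slope_bounds(y_sample, x_sample)
    print(f"slope bounds: [{lower:.3f}, {upper:.3f}]")
```

Without further restrictions these bounds always contain zero, which is one way to see why additional structure (such as covariates common to both datasets, or shape restrictions) matters for obtaining informative regions.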

Keywords: Partially Linear Models; Data Combination; Intergenerational Income Mobility

JEL Codes: C14; C21; J62

Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.

Causal Claims

Cause	Effect
covariates (xc and xnc) (C29)	outcome variable (y) (C29)
identified set for the parameter of interest (0 or 0k) (C20)	convex and compact characteristics (C61)
shape restrictions on f (C46)	sign of 0k (C29)
optimal transport theory (L91)	construction of identified set (D79)
method developed (C59)	confidence regions (C46)
method developed (C59)	good performance in finite samples (C52)
method developed (C59)	study intergenerational income mobility (J62)
