Working Paper: CEPR ID: DP15840
Authors: Anthony Strittmatter; Conny Wunsch
Abstract: The vast majority of existing studies that estimate the average unexplained gender pay gap use unnecessarily restrictive linear versions of the Blinder-Oaxaca decomposition. Using a notably rich and large data set of 1.7 million employees in Switzerland, we investigate how the methodological improvements made possible by such big data affect estimates of the unexplained gender pay gap. We study the sensitivity of the estimates with regard to i) the availability of observationally comparable men and women, ii) model flexibility when controlling for wage determinants, and iii) the choice of different parametric and semi-parametric estimators, including variants that make use of machine learning methods. We find that these three factors matter greatly. Blinder-Oaxaca estimates of the unexplained gender pay gap decline by up to 39% when we enforce comparability between men and women and use a more flexible specification of the wage equation. Semi-parametric matching yields estimates that, when compared with the Blinder-Oaxaca estimates, are up to 50% smaller and also less sensitive to the way wage determinants are included.
Keywords: gender inequality; gender pay gap; common support; model specification; matching estimator; machine learning
JEL Codes: J31; C21
Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.
Cause | Effect |
---|---|
Methodological choices (C90) | Estimated unexplained gender pay gap (J79) |
Enforcing comparability between men and women (J78) | Estimated unexplained gender pay gap (J79) |
Using more flexible specifications of the wage equation (J39) | Estimated unexplained gender pay gap (J79) |
Semiparametric matching (C14) | Estimated unexplained gender pay gap (J79) |
Lack of comparable men for women (J79) | Raw gender pay gap (J79) |
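
The abstract refers to the linear Blinder-Oaxaca decomposition and to enforcing comparability (common support) between men and women. As a minimal illustration of those two ideas, the sketch below splits a raw log-wage gap into an explained and an unexplained part after dropping women who fall in a covariate cell that contains no men. It is not the authors' implementation and does not use their data: the data are synthetic, the function and variable names (`blinder_oaxaca`, `on_common_support`, `design`) are hypothetical, and the choice of men's coefficients as the reference wage structure is an assumption made for the example.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def blinder_oaxaca(X_m, y_m, X_f, y_f):
    """Linear Blinder-Oaxaca decomposition with men's coefficients as reference (assumption).

    raw         = mean(y_m) - mean(y_f)
    explained   = (mean(X_m) - mean(X_f)) @ beta_m
    unexplained = mean(X_f) @ (beta_m - beta_f)
    """
    beta_m, beta_f = ols(X_m, y_m), ols(X_f, y_f)
    raw = y_m.mean() - y_f.mean()
    explained = (X_m.mean(axis=0) - X_f.mean(axis=0)) @ beta_m
    unexplained = X_f.mean(axis=0) @ (beta_m - beta_f)
    return raw, explained, unexplained


def on_common_support(cells_m, cells_f):
    """Boolean mask of women whose covariate cell also contains at least one man."""
    return np.isin(cells_f, np.unique(cells_m))


def design(exp, occ):
    """Design matrix: intercept, experience, dummies for occupation cells 1 and 2."""
    return np.column_stack([np.ones(len(exp)), exp, occ == 1, occ == 2]).astype(float)


# Synthetic data (purely illustrative): log wages with a built-in
# unexplained gap of 0.1 between otherwise identical men and women.
rng = np.random.default_rng(0)
n = 5000
exp_m = rng.uniform(0, 40, n)
exp_f = rng.uniform(0, 30, n)
occ_m = rng.integers(0, 3, n)  # men appear in occupation cells 0..2
occ_f = rng.integers(0, 4, n)  # women also appear in cell 3, where no men are observed
y_m = 3.0 + 0.02 * exp_m + 0.1 * occ_m + rng.normal(0, 0.1, n)
y_f = 2.9 + 0.02 * exp_f + 0.1 * occ_f + rng.normal(0, 0.1, n)

# Enforce comparability: keep only women with at least one man in their cell.
keep = on_common_support(occ_m, occ_f)

raw, explained, unexplained = blinder_oaxaca(
    design(exp_m, occ_m), y_m,
    design(exp_f[keep], occ_f[keep]), y_f[keep],
)
print(f"share of women on support: {keep.mean():.2f}")
print(f"raw={raw:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
```

Because both regressions include an intercept, the explained and unexplained parts add up exactly to the raw gap on the trimmed sample. Exact cell matching on a single discrete variable is only the simplest way to define common support; the paper's own support definitions and its semi-parametric matching estimators are not reproduced here.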