Do Two Wrongs Make a Right? Measuring the Effect of Publications on Science Careers

Working Paper: NBER ID: w31844

Authors: Donna K. Ginther; Carlos Zambrana; Patricia Oslund; Wanying Chang

Abstract: This paper examines whether publication data matched to the Survey of Doctorate Recipients can be used for research purposes. We use Gold Standard data created to validate the publication match quality and compare these measures to publications assigned by a machine-learning algorithm developed by Thomson Reuters (now Clarivate). Our econometric model demonstrates that publications likely suffer from non-classical measurement error. Using horse race and instrumental variable models, we confirm that the Gold Standard data are relatively free from measurement error but show that the Clarivate data suffer from non-classical measurement error. We employ a variety of methods to adjust the Clarivate data for false negatives and false positives and demonstrate that with these adjustments the data produce estimates very similar to the Gold Standard. However, these adjustments are not as useful when publications are used as a dependent variable. We recommend using subsamples of the data that have better match quality when using the Clarivate data as a dependent variable.


JEL Codes: C26; J40; O30

Causal Claims

Cause	Effect
Clarivate data (Y10)	non-classical measurement error (C20)
Gold Standard data (Y10)	relatively free from measurement error (C20)
adjustments for false negatives and false positives in Clarivate data (C80)	estimates similar to Gold Standard (C13)
publication counts (A14)	career outcomes (salaries and likelihood of receiving federal research funding) (I23)
Clarivate data (with adjustments) (Y10)	reliable estimates for career outcomes (J24)
publication data as dependent variable (C29)	inadequate adjustments (F32)
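
The measurement-error problem described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' model or data. It assumes a hypothetical setup in which true publication counts are Poisson, each true publication is missed with a fixed probability (false negatives), and a Poisson number of wrong matches is added (false positives). All names and parameter values (`simulate`, `miss_rate`, `false_pos_mean`, and so on) are illustrative assumptions. It compares the slope from a regression on the true counts, the slope from a regression on the noisy counts, and an instrumental-variable slope that uses a second, independently mismatched count as the instrument.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


def iv_slope(x, y, z):
    """Simple IV (Wald) slope of y on x, using z as the instrument."""
    z_c = z - z.mean()
    return float(z_c @ (y - y.mean()) / (z_c @ (x - x.mean())))


def simulate(n=200_000, beta=0.05, miss_rate=0.3, false_pos_mean=1.0, seed=0):
    """Toy data: an outcome driven by true publications, observed with match errors."""
    rng = np.random.default_rng(seed)

    # Hypothetical true publication counts and an outcome (e.g. log salary).
    true_pubs = rng.poisson(5.0, size=n)
    outcome = 10.0 + beta * true_pubs + rng.normal(0.0, 0.5, size=n)

    def mismatched_count():
        # False negatives: each true publication is matched with prob 1 - miss_rate.
        matched = rng.binomial(true_pubs, 1.0 - miss_rate)
        # False positives: publications wrongly assigned to the person.
        wrong = rng.poisson(false_pos_mean, size=n)
        return matched + wrong

    noisy_pubs = mismatched_count()   # stand-in for an algorithmic match
    second_pubs = mismatched_count()  # independent second measure, used as instrument

    return {
        "ols_true": ols_slope(true_pubs, outcome),
        "ols_noisy": ols_slope(noisy_pubs, outcome),
        "iv_noisy": iv_slope(noisy_pubs, outcome, second_pubs),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

In this assumed setup the error in the noisy count is correlated with the true count, because people with more publications have more publications to miss. This makes the error non-classical. The regression on the noisy count understates the true slope, while the IV estimate overshoots it by roughly a factor of 1 / (1 - miss_rate) instead of recovering it. This is one way to see why an instrument alone may not fix this kind of error.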
