Working Paper: NBER ID: w12039
Authors: Pierre Azoulay; Andrew Stellman; Joshua Graff Zivin
Abstract: We present PublicationHarvester, an open-source software tool for gathering publication information on individual life scientists. The software interfaces with MEDLINE and allows the end user to specify up to four MEDLINE-formatted names for each researcher. Using these names along with a user-specified search query, PublicationHarvester generates yearly publication counts, optionally weighted by Journal Impact Factors. These counts are further broken down by order on the authorship list (first, last, second, next-to-last, middle) and by publication type (clinical trials, regular journal articles, reviews, letters/editorials, etc.). The software also generates a keywords report at the scientist-year level, using the Medical Subject Headings (MeSH) assigned by the National Library of Medicine to each publication indexed by MEDLINE. The software, source code, and user manual can be downloaded at http://www.stellman-greene.com/PublicationHarvester/
Keywords: Publication data; Software tool; Science policy; Publication counts; Journal impact factors
JEL Codes: O32
Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.
Cause | Effect |
---|---|
PublicationHarvester (Y30) | increased efficiency in data collection (C80) |
PublicationHarvester (Y30) | quality of publication data (L15) |
PublicationHarvester (name specification) (Y30) | more accurate publication counts (C80) |
PublicationHarvester (publication type differentiation) (Y30) | improved data accuracy (Y10) |
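
The abstract describes yearly publication counts broken down by position on the authorship list. The Python sketch below illustrates one way such a breakdown could be computed; it is not PublicationHarvester's actual code. The function names, the `year` and `authors` record keys, and the sample data are hypothetical. The tie-breaking order for short author lists (first, then last, then second, then next-to-last) is also an assumption, since the abstract does not say how a sole author or a two-author paper is classified. Impact-factor weighting and publication types are left out.

```python
from collections import Counter


def author_position(index, n_authors):
    """Classify a 0-based index on an authorship list of length n_authors.

    Assumed precedence for short lists: first > last > second > next-to-last.
    """
    if not 0 <= index < n_authors:
        raise ValueError("index out of range for authorship list")
    if index == 0:
        return "first"
    if index == n_authors - 1:
        return "last"
    if index == 1:
        return "second"
    if index == n_authors - 2:
        return "next-to-last"
    return "middle"


def yearly_counts(publications, name_variants):
    """Count one researcher's publications per (year, authorship position).

    publications: iterable of dicts with hypothetical keys 'year' and 'authors'.
    name_variants: set of MEDLINE-formatted names for the researcher
                   (the abstract allows up to four per researcher).
    """
    counts = Counter()
    for pub in publications:
        authors = pub["authors"]
        for i, name in enumerate(authors):
            if name in name_variants:
                counts[(pub["year"], author_position(i, len(authors)))] += 1
                break  # count each publication once
    return counts


# Made-up records, for illustration only.
sample = [
    {"year": 2001, "authors": ["Roe A", "Poe B", "Doe J"]},
    {"year": 2001, "authors": ["Doe JQ"]},
    {"year": 2002, "authors": ["Roe A", "Poe B", "Doe J", "Moe C", "Loe D"]},
]
counts = yearly_counts(sample, {"Doe J", "Doe JQ"})
```

Matching on a set of name variants mirrors the abstract's point that one researcher may appear under several MEDLINE-formatted names.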