PublicationHarvester: An Open-Source Software Tool for Policy Research

Working Paper: NBER ID: w12039

Authors: Pierre Azoulay; Andrew Stellman; Joshua Graff Zivin

Abstract: We present PublicationHarvester, an open-source software tool for gathering publication information on individual life scientists. The software interfaces with MEDLINE and allows the end user to specify up to four MEDLINE-formatted names for each researcher. Using these names along with a user-specified search query, PublicationHarvester generates yearly publication counts, optionally weighted by Journal Impact Factors. These counts are further broken down by order on the authorship list (first, last, second, next-to-last, middle) and by publication type (clinical trials, regular journal articles, reviews, letters/editorials, etc.). The software also generates a keywords report at the scientist-year level, using the Medical Subject Headings (MeSH) assigned by the National Library of Medicine to each publication indexed by MEDLINE. The software, source code, and user manual can be downloaded at http://www.stellman-greene.com/PublicationHarvester/
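
The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not PublicationHarvester's own code: the record layout, the function names `author_position` and `yearly_counts`, the sample data, and the impact factor values are all hypothetical. Two further assumptions are made where the abstract is silent: on short author lists "first" and "last" take precedence over "second" and "next-to-last", and a journal missing from the impact factor table contributes a weight of zero.

```python
from collections import defaultdict


def author_position(index, n_authors):
    """Classify a 0-based index on an author list of length n_authors.

    Assumed precedence for short lists: first, last, second, next-to-last.
    """
    if index == 0:
        return "first"
    if index == n_authors - 1:
        return "last"
    if index == 1:
        return "second"
    if index == n_authors - 2:
        return "next-to-last"
    return "middle"


def yearly_counts(publications, name_variants, impact_factors=None):
    """Return {(year, position): count} for one researcher.

    If impact_factors (journal -> factor) is given, each publication adds
    its journal's factor instead of 1. Journals absent from the table add
    0.0, which is an assumption of this sketch.
    """
    names = {n.lower() for n in name_variants}
    counts = defaultdict(float)
    for pub in publications:
        authors = [a.lower() for a in pub["authors"]]
        # Index of the first author entry matching any name variant.
        match = next((i for i, a in enumerate(authors) if a in names), None)
        if match is None:
            continue
        weight = 1.0
        if impact_factors is not None:
            weight = impact_factors.get(pub["journal"], 0.0)
        key = (pub["year"], author_position(match, len(authors)))
        counts[key] += weight
    return dict(counts)


# Hypothetical records and name variants, for illustration only.
pubs = [
    {"year": 2001, "journal": "J A", "authors": ["Smith J", "Doe A", "Lee K"]},
    {"year": 2001, "journal": "J B", "authors": ["Doe A", "Smith JA"]},
    {"year": 2002, "journal": "J A",
     "authors": ["Roe B", "Poe C", "Moe D", "Smith J", "Doe A"]},
]
variants = ["Smith J", "Smith JA"]

unweighted = yearly_counts(pubs, variants)
weighted = yearly_counts(pubs, variants, {"J A": 2.5, "J B": 4.0})
```

Keying the counts on (year, position) keeps the scientist-year structure the abstract describes. A breakdown by publication type could be added by extending the key in the same way.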

Keywords: Publication data; Software tool; Science policy; Publication counts; Journal impact factors

JEL Codes: O32


Causal Claims Network Graph

Edges that are evidenced by causal inference methods are in orange, and the rest are in light blue.


Causal Claims

Cause | Effect
publicationharvester (Y30) | increased efficiency in data collection (C80)
publicationharvester (Y30) | quality of publication data (L15)
publicationharvester (name specification) (Y30) | more accurate publication counts (C80)
publicationharvester (publication type differentiation) (Y30) | improved data accuracy (Y10)
