Close to You? Bias and Precision in Patent-Based Measures of Technological Proximity

Working Paper: NBER ID: w13322

Authors: Mary Benner; Joel Waldfogel

Abstract: Patent data have been widely used in research on technological innovation to characterize firms' locations, as well as the proximities among firms, in knowledge space. Researchers can measure proximity among firms with a variety of measures based on patent class data, including the Euclidean distance, correlation, and angle between firms' patent class distributions. Alternatively, one can measure proximity using overlap in cited patents. We point out that proximity measures based on small numbers of patents are imprecisely measured random variables: computed on samples with few patents, they are both biased and imprecise. We explore the effects of larger sample sizes and coarser patent class breakdowns in mitigating these problems. Where possible, we suggest that researchers increase their sample sizes by aggregating years or using all of the listed patent classes on a patent, rather than just the first.
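
The small-sample problem described in the abstract can be illustrated with a short simulation. The following is a minimal sketch under stated assumptions, not the authors' procedure: two hypothetical firms draw patents from the same true class distribution (uniform over 20 classes, an illustrative choice), so their true Euclidean distance is 0 and their true cosine similarity is 1. All function names, the number of classes, and the sample sizes are assumptions made for illustration.

```python
import math
import random


def class_shares(patent_classes, n_classes):
    """Convert a list of patent class indices into a vector of class shares."""
    counts = [0] * n_classes
    for c in patent_classes:
        counts[c] += 1
    return [k / len(patent_classes) for k in counts]


def euclidean_distance(p, q):
    """Euclidean distance between two share vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def correlation(p, q):
    """Pearson correlation across classes (undefined if a vector is constant)."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    var_p = sum((a - mp) ** 2 for a in p)
    var_q = sum((b - mq) ** 2 for b in q)
    return cov / math.sqrt(var_p * var_q)


def cosine_similarity(p, q):
    """Cosine of the angle between two share vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))


def angle(p, q):
    """Angle in radians between two share vectors (clamped for rounding)."""
    return math.acos(max(-1.0, min(1.0, cosine_similarity(p, q))))


def simulate_mean_measures(true_shares, n_patents, n_draws, seed=0):
    """Average measured distance and cosine similarity between two firms
    that both sample n_patents patents from the same true_shares."""
    rng = random.Random(seed)
    n_classes = len(true_shares)
    classes = range(n_classes)
    dist_sum = cos_sum = 0.0
    for _ in range(n_draws):
        firm_a = class_shares(
            rng.choices(classes, weights=true_shares, k=n_patents), n_classes)
        firm_b = class_shares(
            rng.choices(classes, weights=true_shares, k=n_patents), n_classes)
        dist_sum += euclidean_distance(firm_a, firm_b)
        cos_sum += cosine_similarity(firm_a, firm_b)
    return dist_sum / n_draws, cos_sum / n_draws


# Hypothetical setup: identical firms, uniform over 20 classes.
uniform = [1 / 20] * 20
for n_patents in (5, 50, 500):
    mean_dist, mean_cos = simulate_mean_measures(uniform, n_patents, n_draws=200)
    print(f"patents={n_patents:4d}  mean distance={mean_dist:.3f}  "
          f"mean cosine={mean_cos:.3f}")
```

Because the two firms are identical by construction, any measured distance above 0 or cosine similarity below 1 is sampling noise. In this sketch the average measured distance shrinks toward 0 and the average cosine similarity rises toward 1 as the number of patents per firm grows, which is the pattern the abstract describes.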

JEL Codes: O31

Causal Claims Network Graph

[Figure: network graph of the causal claims listed below; edges evidenced by causal inference methods are orange, the rest light blue.]

Causal Claims

Cause	Effect
sample size (C83)	accuracy of proximity measures (C52)
small samples (C90)	biased and imprecise proximity measures (C49)
increasing sample size (C83)	convergence of mean estimates to true values (C51)
coarser patent class groupings (L70)	reduced bias (C46)
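
The last claim in the table, that coarser patent class groupings reduce bias, can be sketched in the same spirit. This is an illustrative simulation under assumptions that are not taken from the paper: two hypothetical firms draw 5 patents each from the same uniform distribution over 20 fine classes, and the fine classes are optionally merged into coarse groups of adjacent classes. The function names and all parameter values are assumptions.

```python
import math
import random


def shares(draws, n_bins):
    """Share of draws falling in each of n_bins classes."""
    counts = [0] * n_bins
    for d in draws:
        counts[d] += 1
    return [c / len(draws) for c in counts]


def cosine(p, q):
    """Cosine of the angle between two share vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / math.sqrt(sum(a * a for a in p) * sum(b * b for b in q))


def mean_cosine(n_fine, group_size, n_patents, n_draws, seed=0):
    """Mean measured cosine similarity between two firms that sample from
    the same uniform distribution over n_fine classes, after merging every
    group_size adjacent fine classes into one coarse class."""
    rng = random.Random(seed)
    n_coarse = math.ceil(n_fine / group_size)
    total = 0.0
    for _ in range(n_draws):
        firms = []
        for _ in range(2):
            fine = [rng.randrange(n_fine) for _ in range(n_patents)]
            coarse = [c // group_size for c in fine]
            firms.append(shares(coarse, n_coarse))
        total += cosine(firms[0], firms[1])
    return total / n_draws


# True similarity is 1 in both cases; only the class breakdown differs.
fine_result = mean_cosine(n_fine=20, group_size=1, n_patents=5, n_draws=300)
coarse_result = mean_cosine(n_fine=20, group_size=5, n_patents=5, n_draws=300)
print(f"20 fine classes:  mean cosine = {fine_result:.3f}")
print(f"4 coarse classes: mean cosine = {coarse_result:.3f}")
```

With few patents per firm, the coarse breakdown yields an average similarity closer to the true value of 1 in this setup. The sketch covers only the case of identical firms; it says nothing about how coarsening affects firms whose true distributions differ.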