Working Paper: CEPR ID: DP17865
Authors: Jeanine Miklós-Thal; Avi Goldfarb; Avery Haviv; Catherine Tucker
Abstract: When a user shares multi-dimensional data about themselves with a firm, the firm learns about the correlations of different dimensions of user data. We incorporate this type of learning into a model of a data market in which a firm acquires data from users with privacy concerns. User data is multi-dimensional, and each user can share no data, only non-sensitive data, or their full data with the firm. As the firm collects more data and becomes better at drawing inferences about a user’s privacy-sensitive data from their non-sensitive data, the share of new users who share no data (“digital hermits”) grows. At the same time, the share of new users who share their full data also grows. The model therefore predicts a polarization of users’ data sharing choices away from non-sensitive data sharing to no sharing and full sharing.
Keywords: digital markets
JEL Codes: D42; D82; D83; L20
Cause | Effect |
---|---|
amount of data shared by early users (D16) | firm's ability to make accurate predictions about sensitive data from nonsensitive data (C52) |
firm's ability to make accurate predictions about sensitive data from nonsensitive data (C52) | share of new users who choose to share no data (D16) |
amount of data shared by early users (D16) | share of new users who choose to share no data (D16) |
firm's ability to make accurate predictions about sensitive data from nonsensitive data (C52) | share of new users who share all their data (D16) |
users' privacy valuations (J17) | share of new users who share all their data (D16) |
firm's learning process (L21) | proportion of users sharing only nonsensitive data (D16) |
firm's learning process (L21) | equilibrium prices for nonsensitive data (P22) |
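The cause/effect rows above form a small directed graph, so indirect links (for example, early users' data affecting new users' choices through the firm's prediction ability) can be read off by traversal. The following is a minimal sketch under stated assumptions: the short node keys such as `early_data_shared` are hypothetical labels introduced here for readability, and the table gives no signs or magnitudes, so the code only tracks which effects are reachable from a given cause.

```python
from collections import defaultdict, deque

# Edges transcribed from the Cause | Effect table. The short keys are
# hypothetical labels (not from the paper) standing in for the full row text.
EDGES = [
    ("early_data_shared", "inference_accuracy"),
    ("inference_accuracy", "share_no_data"),
    ("early_data_shared", "share_no_data"),
    ("inference_accuracy", "share_full_data"),
    ("privacy_valuations", "share_full_data"),
    ("learning_process", "share_nonsensitive_only"),
    ("learning_process", "nonsensitive_price"),
]


def build_graph(edges):
    """Build an adjacency list mapping each cause to its direct effects."""
    graph = defaultdict(list)
    for cause, effect in edges:
        graph[cause].append(effect)
    return graph


def downstream(graph, start):
    """Return every node reachable from `start` (breadth-first), sorted by name."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


GRAPH = build_graph(EDGES)
print(downstream(GRAPH, "early_data_shared"))
```

Calling `downstream(GRAPH, "learning_process")` in the same way lists the two outcomes the table attributes to the firm's learning process. Reachability says nothing about the direction or size of an effect; those come from the paper's model, not from this sketch.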