Demand Estimation with Text and Image Data

Working Paper: CEPR ID: DP18507

Authors: Giovanni Compiani; Ilya Morozov; Stephan Seiler

Abstract: We propose a demand estimation method that allows researchers to estimate substitution patterns from unstructured image and text data. We first employ a series of machine learning models to measure product similarity from products' images and textual descriptions. We then estimate a nested logit model with product-pair specific nesting parameters that depend on the image and text similarities between products. Our framework does not require collecting product attributes for each category and can capture product similarity along dimensions that are hard to account for with observed attributes. We apply our method to a dataset describing the behavior of Amazon shoppers across several categories and show that incorporating texts and images in demand estimation helps us recover a flexible cross-price elasticity matrix.
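As a rough illustration of the similarity-measurement step the abstract describes, the sketch below computes pairwise cosine similarities between product embeddings. It assumes the embeddings have already been produced by some image or text model. The function name `cosine_similarity_matrix` and the numbers are hypothetical, and cosine similarity is only one common choice, not necessarily the measure the authors use.

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between the rows of `embeddings`."""
    # Scale each row to unit length so that a dot product equals the cosine.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    return unit @ unit.T

# Hypothetical 3-dimensional embeddings for three products (made-up numbers;
# real image or text embeddings would have many more dimensions).
embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
])

similarity = cosine_similarity_matrix(embeddings)
print(np.round(similarity, 3))
```

In a setup like the one the abstract outlines, a matrix of this kind would be the input that product-pair specific nesting parameters depend on. The mapping from similarity to nesting parameters is part of the paper's model and is not reproduced here.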

Keywords: demand estimation; unstructured data; computer vision; text models

JEL Codes: C1; C5; C81


Causal Claims Network Graph

Edges supported by causal inference methods are shown in orange; all other edges are shown in light blue.


Causal Claims

Cause | Effect
Product similarities derived from unstructured data (C52) | Consumer substitution behavior (D12)
Incorporating text and images into demand estimation (C51) | Cross-price elasticity matrix (C10)
Proposed method for demand estimation (C51) | Substitution patterns (C60)
Proposed method for demand estimation (C51) | More accurate reflections of consumer behavior (D12)
Model generates cross-price elasticities (C51) | Differ substantially from simple logit model (C35)
Method captures additional dimensions of similarity (C59) | Not accounted for in traditional models (E19)
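
For context on the simple logit benchmark named in the claims above, the following sketch computes the textbook simple logit elasticity matrix. It is not the paper's model. Under simple logit, the cross-price elasticity of product j with respect to the price of product k equals alpha * p_k * s_k, which does not depend on j, so every product substitutes toward k at the same rate. The function names, the `quality` vector, and all numbers are hypothetical.

```python
import numpy as np

def logit_shares(mean_utility: np.ndarray) -> np.ndarray:
    """Simple logit market shares with an outside option of utility zero."""
    expu = np.exp(mean_utility)
    return expu / (1.0 + expu.sum())

def logit_elasticities(prices: np.ndarray, shares: np.ndarray, alpha: float) -> np.ndarray:
    """Elasticity matrix E[j, k] = d log s_j / d log p_k under simple logit.

    Assumes mean utility of the form quality_j - alpha * price_j with alpha > 0.
    """
    n = len(prices)
    # Cross-price terms: alpha * p_k * s_k, identical in every row j.
    elast = np.tile(alpha * prices * shares, (n, 1))
    # Own-price terms on the diagonal: -alpha * p_j * (1 - s_j).
    elast[np.diag_indices(n)] = -alpha * prices * (1.0 - shares)
    return elast

# Made-up inputs for three products.
alpha = 0.5
prices = np.array([2.0, 3.0, 4.0])
quality = np.array([1.0, 1.5, 2.0])

shares = logit_shares(quality - alpha * prices)
elasticities = logit_elasticities(prices, shares, alpha)
print(np.round(elasticities, 3))
```

Each column of the printed matrix has identical off-diagonal entries. This is the restriction that a model with product-pair specific nesting parameters is meant to relax.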
