Problem Set #1: BLP Methodology
Econ 637: Empirical IO
Attached is a table of market shares, prices, and characteristics on the top-selling brands
of cereal in 1992. The data are aggregated from household-level scanner data (collected at
supermarket checkout counters).
The market shares below are shares of total cereal purchases observed in the dataset. For
the purposes of this problem set, assume that all households purchased some cereal during
1992, so that non-purchase is not an option (see footnote 1). Assume that brand #51, the
composite basket of “all other brands”, is the outside good.
Two sets of prices are given in the table. Shelf prices are those listed on supermarket
shelves, and do not include coupon discounts. Transaction prices are the prices actually paid
by consumers, net of coupon discounts. Estimate using the transaction prices. Note that
you should subtract the price of brand #51, the “outside good”, from the prices of the top
fifty brands.
Assume the following specification for $u_{ij}$, household $i$'s utility from brand $j$:
$$u_{ij} = X_j \beta_i - \alpha_i p_j + \xi_j + \nu_{ij}$$
where $X_j$ are characteristics of brand $j$, $\xi_j$ is an unobserved (to the econometrician)
quality parameter for brand $j$, and $\nu_{ij}$ is a disturbance term which is independently and
identically distributed (i.i.d.) over households $i$ and brands $j$. Assume
$$\beta_{ik} = \beta_k + \sigma_k \varsigma_{ik}$$
$$\alpha_i = \alpha + \sigma_0 \varsigma_{i0}$$
where $\varsigma_{i0}, \dots, \varsigma_{iK}$ are i.i.d. standard normal.
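To make the random-coefficients specification concrete, the following minimal sketch draws one household's coefficients as a mean plus a scale parameter times a standard normal shock. The function name and all parameter values are hypothetical and chosen only for illustration; nothing here comes from the cereal data.

```python
import random

def draw_household_coefficients(beta, sigma, alpha, sigma0, rng):
    """Draw (beta_i, alpha_i) for one household.

    beta, sigma: lists of length K (means and scale parameters).
    alpha, sigma0: mean and scale parameter of the price coefficient.
    """
    # One standard normal shock per coefficient:
    # index 0 is for price, indices 1..K are for the characteristics.
    shocks = [rng.gauss(0.0, 1.0) for _ in range(len(beta) + 1)]
    alpha_i = alpha + sigma0 * shocks[0]
    beta_i = [b + s * z for b, s, z in zip(beta, sigma, shocks[1:])]
    return beta_i, alpha_i

rng = random.Random(0)  # fixed seed so the draw is reproducible
# Hypothetical parameter values (assumptions, not estimates).
beta, sigma = [1.0, -0.5], [0.3, 0.0]
alpha, sigma0 = 2.0, 0.4
beta_i, alpha_i = draw_household_coefficients(beta, sigma, alpha, sigma0, rng)
```

Setting a scale parameter to zero, as for the second characteristic here, makes that coefficient identical across households. Setting all of them to zero reduces the model to the plain logit.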
Footnote 1: This is not far from the truth; an alternative data source (the IRI Marketing
FactBook) shows that in 1992, 97.1% of American households purchased at least some cereal
during the year.
As in Berry (1994), denote the mean utility level from brand $j$ as
$$\delta_j = X_j \beta - \alpha p_j + \xi_j$$
1. Assuming that the $\nu_{ij}$'s are distributed i.i.d. standard type I extreme value, derive the
resulting expressions for the market shares of each brand $j$, $j = 1, \dots, 51$. Next we implement
the BLP two-step estimator.
2. Invert the resulting system of demand functions to get estimates of the mean utility
levels $\delta_j$ as a function of the shares $s_j$.
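As a rough illustration of questions 1 and 2, the sketch below assumes the plain logit case (all scale parameters on the random coefficients set to zero) and normalizes the outside good's mean utility to zero. Under those assumptions the shares take the familiar logit form, and the inversion is the Berry (1994) log-share difference. The share values are toy numbers, not taken from the cereal table, and the function names are hypothetical.

```python
import math

def logit_shares(delta):
    """Inside-good shares given mean utilities; the outside good has delta = 0."""
    expd = [math.exp(d) for d in delta]
    denom = 1.0 + sum(expd)  # the 1.0 is exp(0) for the outside good
    return [e / denom for e in expd]

def invert_shares(shares):
    """Berry (1994) inversion for plain logit: delta_j = ln(s_j) - ln(s_0)."""
    s0 = 1.0 - sum(shares)   # outside-good share
    return [math.log(s) - math.log(s0) for s in shares]

# Toy shares for three inside goods (NOT from the cereal table).
shares = [0.20, 0.10, 0.05]
delta = invert_shares(shares)
recovered = logit_shares(delta)  # should reproduce `shares`
```

Applying `logit_shares` to the inverted mean utilities returns the original shares, which is a quick check that the two functions are inverses of each other.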
3. Estimate the second-stage regression of $\delta_j$ on $X_j$ and $p_j$ in different ways:
(a) OLS
(b) 2SLS: using average characteristics of all other brands produced by the same
manufacturer as brand $j$ as instruments for $p_j$
(c) 2SLS: using average characteristics of all brands produced by rivals of brand $j$'s
manufacturer as instruments for $p_j$
(d) 2SLS: using average characteristics of all other brands as instruments for $p_j$
How do your results differ?
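The regressions in question 3 could be set up along the following lines. This is a sketch under stated assumptions, not a prescribed implementation: it uses NumPy, the helper names (`other_brand_means`, `ols`, `two_sls`) are hypothetical, and the data are simulated with the unobserved quality term set to zero so that both estimators recover the coefficients exactly. It builds the instruments of variant (b); passing `same_firm=False` gives the rival-brand averages of variant (c).

```python
import numpy as np

def other_brand_means(X, firm, same_firm):
    """For each brand j, average X over the other brands of the same firm
    (same_firm=True) or over the brands of rival firms (same_firm=False)."""
    out = np.zeros_like(X, dtype=float)
    for j in range(X.shape[0]):
        mask = (firm == firm[j]) if same_firm else (firm != firm[j])
        mask[j] = False                  # never include brand j itself
        if mask.any():                   # row stays at zero if no such brands exist
            out[j] = X[mask].mean(axis=0)
    return out

def ols(y, X):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_sls(y, X, Z):
    """2SLS: project the regressors X on the instruments Z, then regress y on the fit."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Simulated toy data: 12 brands, 3 firms with 4 brands each (all values hypothetical).
rng = np.random.default_rng(0)
J = 12
firm = np.repeat([0, 1, 2], 4)
chars = rng.normal(size=(J, 2))
iv = other_brand_means(chars, firm, same_firm=True)
price = 3.0 + iv @ np.array([0.8, -0.6]) + 0.1 * rng.normal(size=J)

const = np.ones((J, 1))
X = np.hstack([const, chars, price[:, None]])
# Last entry is the coefficient on price; with delta = X beta - alpha p it equals -alpha.
theta_true = np.array([0.5, 1.0, -0.3, -2.0])
delta = X @ theta_true                   # unobserved quality set to zero for the check

Z = np.hstack([const, chars, iv])        # exogenous regressors plus excluded instruments
theta_ols = ols(delta, X)
theta_iv = two_sls(delta, X, Z)
```

With real data the unobserved quality term is not zero and is likely correlated with price, so the OLS and 2SLS estimates will generally differ. Before running 2SLS it is worth checking that the instrument matrix is not collinear with the included characteristics.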
4. From the aggregate demand functions derived in question 1, derive the formulas for
the derivatives $\partial s_j / \partial p_k$ and the elasticities
$$\varepsilon_{jk} = \frac{\partial s_j}{\partial p_k} \frac{p_k}{s_j},$$
where $j$ and $k$ are any two brands. What is the difference between $\varepsilon_{jj}$ and
$\varepsilon_{jk}$? Explain the implication of this.
5. Assuming that the manufacturers of the top fifty brands compete in Bertrand fashion,
derive the fifty first-order conditions which define prices in this market, assuming constant
marginal costs of production for each brand (and ignoring advertising costs). In other words,
assume that the total cost function for brand $j$ is $C_j(q_j) = c_j q_j$. These FOCs are a system
of linear equations in the unknowns $c_1, \dots, c_{50}$. Using the expression derived in question 4
above, rewrite these FOCs completely in terms of the known prices, shares, and parameters
(in particular, $\alpha$).
Challenging: solve for the marginal costs from this system of equations. Recall that linear
equations of the form $Ax = b$ can be solved by $x = A^{-1} b$. After deriving these costs, solve
for the markup $(p_j - c_j)/p_j$ associated with each brand.
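A possible way to organize the marginal-cost calculation is sketched below, assuming plain logit demand and Bertrand pricing in which each manufacturer maximizes joint profit over its own brands. Under those assumptions the first-order condition for brand $j$ is $s_j + \sum_{k \in F(j)} (p_k - c_k)\, \partial s_k / \partial p_j = 0$, where $F(j)$ is the set of brands sold by brand $j$'s manufacturer, and stacking the conditions gives a linear system in the markups $p - c$. The ownership vector, prices, shares, and $\alpha$ are toy values, and the function names are hypothetical.

```python
import numpy as np

def logit_share_jacobian(s, alpha):
    """D[j, k] = d s_k / d p_j for plain logit (symmetric in j and k)."""
    D = alpha * np.outer(s, s)
    np.fill_diagonal(D, -alpha * s * (1.0 - s))
    return D

def marginal_costs(p, s, alpha, firm):
    """Solve s + (O * D)(p - c) = 0 for c, where O[j, k] = 1 if j and k share a firm."""
    O = (firm[:, None] == firm[None, :]).astype(float)
    A = O * logit_share_jacobian(s, alpha)
    markup = np.linalg.solve(A, -s)      # this is p - c
    return p - markup

# Toy inputs (hypothetical, not from the cereal table).
p = np.array([3.0, 2.5, 4.0])
s = np.array([0.20, 0.10, 0.05])
alpha = 2.0
firm = np.array([0, 0, 1])               # brands 0 and 1 share a manufacturer

c = marginal_costs(p, s, alpha, firm)
lerner = (p - c) / p                     # markup as a fraction of price

# Single-product benchmark: every brand is its own firm.
c_single = marginal_costs(p, s, alpha, np.arange(3))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the usual numerical choice. In the single-product benchmark the system is diagonal and the markup reduces to $1 / (\alpha (1 - s_j))$, which gives a closed-form check on the solver.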