Review of Economic Studies (1993) 60, 531-542 0034-6527/93/00270531$02.00 © 1993 The Review of Economic Studies Limited

Identification of Endogenous
Social Effects:
The Reflection Problem
CHARLES F. MANSKI
University of Wisconsin-Madison
First version received December 1991; final version accepted December 1992 (Eds.)
This paper examines the reflection problem that arises when a researcher observing the distribution of behaviour in a population tries to infer whether the average behaviour in some group influences the behaviour of the individuals that comprise the group. It is found that inference is not possible unless the researcher has prior information specifying the composition of reference groups. If this information is available, the prospects for inference depend critically on the population relationship between the variables defining reference groups and those directly affecting outcomes. Inference is difficult to impossible if these variables are functionally dependent or are statistically independent. The prospects are better if the variables defining reference groups and those directly affecting outcomes are moderately related in the population.
1. INTRODUCTION
A variety of terms in common use connote endogenous social effects, wherein the propensity of an individual to behave in some way varies with the prevalence of that behaviour in some reference group containing the individual. These effects may, depending on the context, be called “social norms”, “peer influences”, “neighbourhood effects”, “conformity”, “imitation”, “contagion”, “epidemics”, “bandwagons”, “herd behaviour”, “social interactions”, or “interdependent preferences”.
Endogenous effects have long been central to sociology and social psychology; see, for example, Asch (1952), Merton (1957), Erbring and Young (1979), and Bandura (1986). Mainstream economics has always been fundamentally concerned with a particular endogenous effect: how an individual’s demand for a product varies with price, which is partly determined by aggregate demand in the relevant market. Economists have also studied other types of endogenous effects. Models of oligopoly posit reaction functions, wherein the output chosen by each firm is a function of aggregate industry output. Schelling (1971) analyzed the residential patterns that emerge when individuals choose not to live in neighbourhoods where the percentage of residents of their own race is below some threshold. Conlisk (1980) showed that, if decision making is costly, it may be optimal for individuals to imitate the behaviour of other persons who are better informed. Akerlof (1980), Jones (1984), and Bernheim (1991) studied the equilibria of non-cooperative games in which individuals are punished for deviation from group norms. Gaertner (1974), Pollak (1976), Alessie and Kapteyn (1991), and Case (1991) analyzed consumer demand models in which, holding price fixed, individual demand increases with the mean demand of a reference group.
While the price-mediated effect of aggregate demand on individual demand is quite generally accepted, other endogenous effects are controversial. Many economists regard such central sociological concepts as norms and peer influences to be spurious phenomena explainable by processes operating entirely at the level of the individual. (See, for example, the Friedman (1957) criticism of Duesenberry (1949).) Even among sociologists, one still does not find consensus on the nature of social effects. For example, there has been a long-running debate about the existence and nature of neighbourhood effects. (See, for example, Jencks and Mayer (1989).)
Why do such different perspectives persist? Why do the social sciences seem unable to converge to common conclusions about the channels through which society affects the individual? I believe that a large part of the answer is the difficulty of the identification problem. Empirical analysis of behaviour often cannot distinguish among competing hypotheses about the nature of social effects.
Economists have long been concerned with the identification of endogenous effects channelled through markets, especially with the conditions under which observations of equilibrium prices and quantities reveal the demand behaviour of consumers and the supply behaviour of firms. But the identification of other endogenous effects has remained relatively unexamined and poorly understood.
This paper examines the “reflection” problem that arises when a researcher observing the distribution of behaviour in a population tries to infer whether the average behaviour in some group influences the behaviour of the individuals that comprise the group. The term reflection is appropriate because the problem is similar to that of interpreting the almost simultaneous movements of a person and his reflection in a mirror. Does the mirror image cause the person’s movements or reflect them? An observer who does not understand something of optics and human behaviour would not be able to tell.
Although the reflection problem has several aspects, the series of simple findings reported in this paper collectively develop a theme: Inference on endogenous effects is not possible unless the researcher has prior information specifying the composition of reference groups. If this information is available, the prospects for inference depend critically on the population relationship between the variables defining reference groups and those directly affecting outcomes. Inference is difficult to impossible if these variables are functionally dependent or statistically independent. The prospects are better if the variables defining reference groups and those directly affecting outcomes are “moderately” related in the population.
Section 2 examines the reflection problem in the context of a linear model applied in many empirical studies of social effects. Section 3 analyzes non-linear models. Section 4 discusses dynamic models. Section 5 relates conventional consumer demand analysis to the work of this paper. Section 6 concludes by stressing the need for richer data if the analysis of social effects is to make more progress.
2. A LINEAR MODEL
The linear model analyzed here gives formal expression to three hypotheses often advanced to explain the common observation that individuals belonging to the same group tend to behave similarly. These hypotheses are:
(a) endogenous effects, wherein the propensity of an individual to behave in some way varies with the behaviour of the group;
(b) exogenous (contextual) effects, wherein the propensity of an individual to behave in some way varies with the exogenous characteristics of the group,¹ and
(c) correlated effects, wherein individuals in the same group tend to behave similarly because they have similar individual characteristics or face similar institutional environments.
1. In the sociological literature, this is referred to as a “contextual effect”. Inference on contextual effects became an important concern of sociologists in the 1960s, when substantial efforts were made to learn the effects on youth of school and neighbourhood environment (e.g. Coleman et al. (1966); Sewell and Armer (1966)). The recent resurgence of interest in spatial concepts of the underclass has spawned many new empirical studies (e.g. Crane (1991), Jencks and Mayer (1989), and Mayer (1991)). I use the term “exogenous” effect as a synonym for contextual effect, to distinguish the idea from endogenous effects.
An example may help to clarify the distinction. Consider the high school achievement of a teenage youth. There is an endogenous effect if, all else equal, individual achievement tends to vary with the average achievement of the students in the youth’s school, ethnic group, or other reference group. There is an exogenous effect if achievement tends to vary with, say, the socio-economic composition of the reference group. There are correlated effects if youths in the same school tend to achieve similarly because they have similar family backgrounds or because they are taught by the same teachers.
The three hypotheses have differing policy implications. Consider, for example, an educational intervention providing tutoring to some of the students in a school but not to the others. If individual achievement increases with the average achievement of the students in the school, then an effective tutoring programme not only directly helps the tutored students but, as their achievement rises, indirectly helps all students in the school, with a feedback to further achievement gains by the tutored students. Exogenous effects and correlated effects do not generate this “social multiplier”.
Section 2.1 specifies the model. Sections 2.2 and 2.3 analyse the identification problem, first considering the general model and then a restricted version assuming that neither exogenous nor correlated effects are present. Section 2.4 shows that, although the linear model sometimes imposes restrictions on observed behaviour, the model holds tautologically if the attributes defining reference groups and those directly affecting outcomes are functionally dependent. Section 2.5 draws implications for the problem of identifying reference groups. Section 2.6 discusses sample inference.
2.1. Model specification
Let each member of a population be characterized by a value for $(y, x, z, u) \in R^1 \times R^J \times R^K \times R^1$. Here $y$ is a scalar outcome (e.g. a youth’s achievement in high school), $x$ are attributes characterizing an individual’s reference group (e.g. a youth’s school or ethnic group), and $(z, u)$ are attributes that directly affect $y$ (e.g. socioeconomic status and ability). A researcher observes a random sample of realizations of $(y, x, z)$. Realizations of $u$ are not observed.
Assume that
$$y = \alpha + \beta E(y \mid x) + E(z \mid x)'\gamma + z'\eta + u, \qquad E(u \mid x, z) = x'\delta, \tag{1}$$
where $(\alpha, \beta, \gamma, \delta, \eta)$ is a parameter vector. It follows that the mean regression of $y$ on $(x, z)$ has the linear form
$$E(y \mid x, z) = \alpha + \beta E(y \mid x) + E(z \mid x)'\gamma + x'\delta + z'\eta. \tag{2}$$
If $\beta \neq 0$, the linear regression (2) expresses an endogenous effect: a person’s outcome $y$ varies with $E(y \mid x)$, the mean of $y$ among those persons in the reference group defined by $x$.² If $\gamma \neq 0$, the model expresses an exogenous effect: $y$ varies with $E(z \mid x)$, the mean of the exogenous variables $z$ among those persons in the reference group. If $\delta \neq 0$, the model expresses correlated effects: persons in reference group $x$ tend to behave similarly because they have similar unobserved individual characteristics $u$ or face similar institutional environments. The parameter $\eta$ expresses the direct effect of $z$ on $y$.³
2. Beginning with Hyman (1942), sociological reference-group theory has sought to express the idea that individuals learn from or are otherwise influenced by the behaviour and attitudes of some reference group. Bank, Slavings, and Biddle (1990) give an historical account. Sociological writing has remained predominately verbal, but economists have interpreted reference groups as conditioning variables, in the manner of (2). See Alessie and Kapteyn (1991) or Manski (1993a).
2.2. Identification of the parameters
The question is whether the two types of social effects can be distinguished from one another and from the non-social effects. Thus, we are interested in identification of the parameter vector $(\alpha, \beta, \gamma, \delta, \eta)$. To focus attention on this question, I shall assume that either (i) $(x, z)$ has discrete support or (ii) $y$ and $z$ have finite variances and the regressions $[E(y \mid x, z), E(y \mid x), E(z \mid x)]$ are continuous on the support of $(x, z)$. Either assumption implies that $[E(y \mid x, z), E(y \mid x), E(z \mid x)]$ are consistently estimable on the support of $(x, z)$.⁴ So we can treat these regressions as known and focus on the parameters.
The reflection problem arises out of the presence of $E(y \mid x)$ as a regressor in (2). Integrating both sides of (2) with respect to $z$ reveals that $E(y \mid x)$ solves the “social equilibrium” equation
$$E(y \mid x) = \alpha + \beta E(y \mid x) + E(z \mid x)'\gamma + x'\delta + E(z \mid x)'\eta. \tag{3}$$
Provided that $\beta \neq 1$, equation (3) has a unique solution, namely
$$E(y \mid x) = \frac{\alpha}{1-\beta} + E(z \mid x)'\frac{\gamma+\eta}{1-\beta} + \frac{x'\delta}{1-\beta}. \tag{4}$$
Thus, $E(y \mid x)$ is a linear function of $[1, E(z \mid x), x]$, where “1” denotes the constant. It follows that the parameters $(\alpha, \beta, \gamma, \delta)$ are all unidentified. Endogenous effects cannot be distinguished from exogenous effects or from correlated effects.
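The social equilibrium in (3) and its closed-form solution (4) can be checked numerically. The sketch below is only an illustration: it assumes scalar $x$ and $z$, made-up parameter values, and $|\beta| < 1$ so that simple iteration on (3) converges. None of these numbers come from the paper.

```python
import numpy as np

# Illustrative (assumed) parameter values; x and z are scalars here.
alpha, beta, gamma, delta, eta = 1.0, 0.4, 0.3, 0.2, 0.5

def equilibrium_closed_form(x, ez_x):
    """Solution (4) of the social equilibrium equation (3)."""
    return (alpha + ez_x * (gamma + eta) + x * delta) / (1.0 - beta)

def equilibrium_by_iteration(x, ez_x, n_iter=200):
    """Iterate m <- alpha + beta*m + E(z|x)*(gamma + eta) + x*delta."""
    m = 0.0
    for _ in range(n_iter):
        m = alpha + beta * m + ez_x * (gamma + eta) + x * delta
    return m

x_values = np.array([0.0, 1.0, 2.0])
ez_values = x_values ** 2          # assumed E(z|x), non-linear in x
closed = equilibrium_closed_form(x_values, ez_values)
iterated = equilibrium_by_iteration(x_values, ez_values)
```

Under these assumptions the iterated values agree with the closed form, which is the sense in which (4) is the unique solution of (3) when $\beta \neq 1$; the iteration itself is only a convenient check and requires $|\beta| < 1$.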
What is identified? Inserting (4) into (2) we obtain the reduced form model
$$E(y \mid x, z) = \frac{\alpha}{1-\beta} + E(z \mid x)'\left[\frac{\gamma+\beta\eta}{1-\beta}\right] + \frac{x'\delta}{1-\beta} + z'\eta. \tag{5}$$
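The intermediate step behind the reduced form is not written out in the text. A short worked version, using only the symbols of (2) and (4), is:

```latex
\begin{aligned}
E(y \mid x, z)
 &= \alpha + \beta\left[\frac{\alpha}{1-\beta}
      + E(z \mid x)'\frac{\gamma+\eta}{1-\beta}
      + \frac{x'\delta}{1-\beta}\right]
      + E(z \mid x)'\gamma + x'\delta + z'\eta \\
 &= \frac{\alpha}{1-\beta}
      + E(z \mid x)'\,\frac{\beta(\gamma+\eta) + (1-\beta)\gamma}{1-\beta}
      + \frac{x'\delta}{1-\beta} + z'\eta ,
\end{aligned}
```

and $\beta(\gamma+\eta) + (1-\beta)\gamma = \gamma + \beta\eta$, which gives the coefficient on $E(z \mid x)$ in the reduced form.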
Inspection of (5) provides our first result:
Proposition 1. In the linear model (2) with $\beta \neq 1$, the composite parameters $\alpha/(1-\beta)$, $(\gamma+\beta\eta)/(1-\beta)$, $\delta/(1-\beta)$, and $\eta$ are identified if the regressors $[1, E(z \mid x), x, z]$ are linearly independent in the population.⁵
Identification of the composite parameters does not enable one to distinguish between the two social effects but does permit one to determine whether some social effect is present. If $(\gamma+\beta\eta)/(1-\beta)$ is non-zero, then either $\beta\eta$ or $\gamma$ must be non-zero.
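A small numerical illustration of this point follows. It is a sketch with scalar parameters and invented values: two different structural parameter vectors, one with an endogenous effect and one without, are mapped to the composite parameters of (5).

```python
import numpy as np

def reduced_form(alpha, beta, gamma, delta, eta):
    """Composite parameters of (5): constant, E(z|x), x and z coefficients."""
    return np.array([
        alpha / (1.0 - beta),
        (gamma + beta * eta) / (1.0 - beta),
        delta / (1.0 - beta),
        eta,
    ])

# Structure A: an endogenous effect (beta = 0.5) and no exogenous effect.
theta_a = dict(alpha=1.0, beta=0.5, gamma=0.0, delta=0.2, eta=0.8)
# Structure B: no endogenous effect, but exogenous and correlated effects
# chosen so that every composite parameter matches structure A.
theta_b = dict(alpha=2.0, beta=0.0, gamma=0.8, delta=0.4, eta=0.8)

rf_a = reduced_form(**theta_a)
rf_b = reduced_form(**theta_b)
```

Both structures imply the same regression $E(y \mid x, z)$, so data on $(y, x, z)$ cannot tell them apart, although the non-zero coefficient on $E(z \mid x)$ does reveal that some social effect is present.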
3. Many generalizations of model (2) are of potential interest. Non-linear and dynamic models will be examined in Sections 3 and 4. Some other directions for generalization include the following:
(i) Each person might be influenced by multiple reference groups, giving more weight to the behaviour of some groups than to others.
(ii) The outcome $y$ might be a vector, yielding a system of endogenous effects with mean reference-group outcomes along each dimension affecting individual outcomes along other dimensions.
(iii) Social effects might be transmitted by distributional features other than the mean. For example, it is sometimes said that the strength of the effect of social norms on individual behaviour depends on the dispersion of behaviour in the reference group; the smaller the dispersion, the stronger the norm.
4. Cell-average estimates may be used if the support of $(x, z)$ is discrete. A variety of non-parametric regression estimates may be used if assumption (ii) holds; see, for example, Härdle (1990). Assumptions (i) and (ii) cover many but not all cases of empirical interest. They do not seem appropriate in studies of small-group social interactions, such as family interactions. In analyses of family interactions, each reference group (i.e. family) has negligible size relative to the population and random sampling of individuals typically does not yield multiple members of the same family. Hence, it is not a good empirical approximation to assume that $(x, z)$ has discrete support. Moreover, unless one can somehow characterize different families as being similar in composition, one cannot assume that $[E(y \mid x, z), E(y \mid x), E(z \mid x)]$ are continuous. Random sampling of individuals is not an effective data-gathering process for the study of family interactions. It is preferable to use families as the sampling unit.
5. Linear independence in the population means that the support of the distribution of $[1, E(z \mid x), x, z]$ is not a proper linear subspace of $R^1 \times R^K \times R^J \times R^K$.
The ability to detect some social effect breaks down if $E(z \mid x)$ is a linear function of $[1, x, z]$. Unfortunately, $E(z \mid x)$ is a linear function of $[1, x, z]$ in various situations, including those stated in the following corollary to Proposition 1.
Corollary. In the linear model (2) with $\beta \neq 1$, the composite social-effects parameter $(\gamma+\beta\eta)/(1-\beta)$ is not identified if any of these conditions hold, almost everywhere.
(a) $z$ is a function of $x$.
(b) $E(z \mid x)$ does not vary with $x$.
(c) $E(z \mid x)$ is a linear function of $x$.
The corollary shows that the ability to infer the presence of social effects depends critically on the manner in which $z$ varies with $x$. In the context of the linear model (2), inference is possible only if $E(z \mid x)$ varies non-linearly with $x$ and $\mathrm{Var}(z \mid x) > 0$.
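The rank condition behind the corollary can be illustrated with simulated regressors. The sketch below assumes scalar $x$ and $z$, five reference groups, and a known $E(z \mid x)$; the seed, sample size and functional forms are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.integers(0, 5, size=n).astype(float)   # five reference groups
noise = rng.normal(size=n)

def regressor_rank(ez_given_x):
    """Rank of [1, E(z|x), x, z] when z = E(z|x) + noise."""
    z = ez_given_x + noise
    design = np.column_stack([np.ones(n), ez_given_x, x, z])
    return np.linalg.matrix_rank(design)

rank_linear = regressor_rank(1.0 + 2.0 * x)      # condition (c): linear in x
rank_constant = regressor_rank(np.full(n, 3.0))  # condition (b): constant
rank_nonlinear = regressor_rank(x ** 2)          # non-linear, Var(z|x) > 0
```

In the first two cases the four regressors span only three dimensions, so the coefficient on $E(z \mid x)$ cannot be separated from the others; in the third case the regressors are linearly independent.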
2.3. A pure endogenous-effects model
The outlook for identification improves if one has information on some parameter values. Empirical studies of endogenous effects typically assume that $\gamma = \delta = 0$; so neither exogenous nor correlated effects are present. In this case, (5) reduces to
$$E(y \mid x, z) = \frac{\alpha}{1-\beta} + E(z \mid x)'\left[\frac{\beta\eta}{1-\beta}\right] + z'\eta. \tag{6}$$
Inspection of (6) shows the following:
Proposition 2. In the linear model (2) with parameter restrictions $\gamma = \delta = 0$ and $\beta \neq 1$, the composite parameters $\alpha/(1-\beta)$, $\beta\eta/(1-\beta)$, and $\eta$ are identified if the regressors $[1, E(z \mid x), z]$ are linearly independent in the population.
The endogenous-effects parameter $\beta$ is not identified if $\eta = 0$ or if $E(z \mid x)$ is a linear function of $[1, z]$. In particular, $\beta$ is not identified if any of these conditions hold, almost everywhere:
(a) $z$ is a function of $x$.
(b) $E(z \mid x)$ does not vary with $x$.
(d) $E(z \mid x)$ is a linear function of $x$, and $x$ is a linear function of $z$.
For example, in a study of school achievement, $\beta$ is identified if $x$ is family income, $z$ is ability, average ability $E(z \mid x)$ varies non-linearly with income, and achievement varies with ability (i.e. $\eta \neq 0$). But $\beta$ is not identified if $x$ is (ability, family income) and $z$ is ability (condition a); if $x$ is family income, $z$ is ability, and average ability does not vary with income (condition b); or if $x$ is family income, $z$ is (ability, family income), and average ability varies linearly with income (condition d).
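The identified case in this example can be traced through at the population level. The following sketch assumes scalar income $x$ and ability $z$, a hypothetical discrete support with $E(z \mid x) = x^2$, and invented parameter values; it recovers $\beta$ from the composite coefficients in (6).

```python
import numpy as np

alpha, beta, eta = 1.0, 0.4, 0.8    # assumed true values, gamma = delta = 0

# Support of (x, z): x is "income", z = x**2 + e is "ability", e in {-1, 0, 1}
# with equal probabilities, so that E(z|x) = x**2 varies non-linearly with x.
x = np.repeat(np.arange(4.0), 3)
e = np.tile(np.array([-1.0, 0.0, 1.0]), 4)
z = x ** 2 + e
ez_x = x ** 2

# Population regression E(y|x,z) from the structural model (2):
# first the social equilibrium E(y|x) from (4), then (2) itself.
ey_x = (alpha + ez_x * eta) / (1.0 - beta)
ey_xz = alpha + beta * ey_x + z * eta

# Least squares of E(y|x,z) on [1, E(z|x), z] returns the composites in (6).
design = np.column_stack([np.ones_like(x), ez_x, z])
const, c, eta_hat = np.linalg.lstsq(design, ey_xz, rcond=None)[0]
beta_hat = c / (c + eta_hat)        # inverts c = beta*eta/(1 - beta)
```

The last line uses the fact that $c = \beta\eta/(1-\beta)$ implies $\beta = c/(c + \eta)$, which is well defined only when $\eta \neq 0$, in line with the proposition.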
2.4. Tautological models
Even when its parameters are unidentified, a social-effects model may impose restrictions on observed behaviour and so have testable implications. There are, however, specifications of (x, z) that make a model hold tautologically. In particular, this is the case when x and z are functionally dependent.
Specify $z$ to be a function of $x$, say $z = z(x)$. Then $E[y \mid x, z(x)] = E(y \mid x)$. So the linear model (2) holds with $\beta = 1$ and $\alpha = \gamma = \delta = \eta = 0$. Thus, observed behaviour is always consistent with the hypothesis that individual behaviour reflects mean reference-group behaviour. For example, if a researcher studying student achievement specifies $x$ to be (ability, family income) and $z$ to be (ability), he will find that the data are consistent with the hypothesis that reference groups are defined by (ability, family income), that individual achievement reflects reference-group achievement, and that ability has no direct effect on achievement.
Conversely, specify $x$ to be a function of $z$, say $x = x(z)$. Then $E[y \mid x(z), z] = E(y \mid z)$. So the semi-linear model
$$E(y \mid x, z) = \alpha + \beta E(y \mid x) + E(z \mid x)'\gamma + x'\delta + g(z) \tag{2'}$$
holds with $\alpha = \beta = \gamma = \delta = 0$ and $g(z) = E(y \mid z)$; the only testable restriction of the linear model (2) is its assumption that $g(\cdot)$ is a linear function. Continuing the school achievement example, a researcher who specifies $x$ to be (family income) and $z$ to be (ability, family income) will find that the data are consistent with the hypothesis that social forces do not affect achievement.
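The first tautology can be seen in simulated data. In the sketch below the outcome is generated with no social interaction at all (an assumed data-generating process), $x$ is (ability, family income) and $z$ is ability; the helper `cell_means` is a hypothetical name introduced for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
ability = rng.integers(0, 3, size=n)
income = rng.integers(0, 4, size=n)
# Arbitrary outcome with no social interaction at all (an assumed DGP).
y = np.sin(ability) + 0.1 * income ** 2 + rng.normal(size=n)

def cell_means(values, keys):
    """Sample analogue of E(values | keys) evaluated at each observation."""
    _, cell = np.unique(keys, axis=0, return_inverse=True)
    cell = cell.ravel()
    sums = np.bincount(cell, weights=values)
    counts = np.bincount(cell)
    return (sums / counts)[cell]

x = np.column_stack([ability, income])   # reference group: (ability, income)
z = ability                              # z is a function of x
ey_x = cell_means(y, x)
ey_xz = cell_means(y, np.column_stack([x, z]))

# Fit E(y|x,z) = a + b*E(y|x) + c*z by least squares.
design = np.column_stack([np.ones(n), ey_x, z])
a, b, c = np.linalg.lstsq(design, ey_xz, rcond=None)[0]
```

Because conditioning on $(x, z(x))$ is the same as conditioning on $x$, the fit returns $b = 1$ and $a = c = 0$ whatever process generated $y$, so the apparent "endogenous effect" carries no information.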
2.5. Identifying reference groups
So far, I have presumed that researchers know how individuals form reference groups and that individuals correctly perceive the mean outcomes experienced by their supposed reference groups. There is substantial reason to question these assumptions. Researchers studying social effects rarely offer empirical evidence to support their reference-group specifications. The prevailing practice is simply to assume that individuals are influenced by $E(y \mid x)$ and $E(z \mid x)$, for some specified $x$.⁶ One of the few studies that does attempt to justify its specification of reference groups is Woittiez and Kapteyn (1991). They use individuals’ responses to questions about their “social environments” as evidence on their reference groups.
If researchers do not know how individuals form reference groups and perceive reference-group outcomes, then it is reasonable to ask whether observed behaviour can be used to infer these unknowns. The findings reported in Section 2.4 imply that this is not possible. Any specification of a functionally dependent pair $(x, z)$ is consistent with observed behaviour. The conclusion to be drawn is that informed specification of reference groups is a necessary prelude to analysis of social effects.
2.6. Sample inference
Although our primary concern is with identification, a discussion of sample inference is warranted.
Empirical studies of social effects have generally assumed that there are no correlated effects and only one of the two types of social effects. Studies of exogenous effects have typically applied a two-stage method to estimate $(\gamma, \eta)$. In the first stage, one uses the sample data on $(z, x)$ to estimate $E(z \mid x)$ non-parametrically; typically $x$ is discrete and the estimate of $E(z \mid x)$ is a cell-average. In the second stage, one estimates $(\gamma, \eta)$ by finding the least squares fit of $y$ to $[1, E_N(z \mid x), z]$, where $E_N(z \mid x)$ is the first-stage estimate of $E(z \mid x)$. See, for example, Coleman et al. (1966), Sewell and Armer (1966), Hauser (1970), Crane (1991) or Mayer (1991).
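The two-stage method for an exogenous-effects model can be sketched as follows. The simulated data, seed, and parameter values are assumptions made for the illustration, with scalar $z$, a discrete group label $x$, and $\beta = \delta = 0$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
alpha, gamma, eta = 2.0, 0.5, 1.0            # assumed true values
x = rng.integers(0, 5, size=n)               # discrete reference-group label
z = (x / 2.0) ** 2 + rng.normal(size=n)      # E(z|x) non-linear in x
# Exogenous-effects model: beta = 0 and delta = 0 in (1).
y = alpha + gamma * (x / 2.0) ** 2 + eta * z + rng.normal(size=n)

# Stage 1: cell-average estimate E_N(z|x).
cell_sum = np.bincount(x, weights=z)
cell_count = np.bincount(x)
ez_hat = (cell_sum / cell_count)[x]

# Stage 2: least-squares fit of y to [1, E_N(z|x), z].
design = np.column_stack([np.ones(n), ez_hat, z])
alpha_hat, gamma_hat, eta_hat = np.linalg.lstsq(design, y, rcond=None)[0]
```

With this design the second-stage coefficients land close to the assumed $(\gamma, \eta)$. The sketch reports point estimates only and makes no attempt at standard errors that account for the estimated first stage.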
6. The same practice is found in empirical studies of decision making under uncertainty. Researchers assume they know how individuals form their expectations but offer no evidence justifying their assumptions. I have recently criticized this practice in the context of studies of schooling choice. See Manski (1993b).
Studies of endogenous effects have also applied a two-stage method to estimate $(\beta, \eta)$, but in the guise of a “spatial correlation” model
$$y_i = \beta W_{iN} Y + z_i'\eta + u_i, \qquad i = 1, \ldots, N. \tag{7}$$
Here $Y = (y_i, i = 1, \ldots, N)$ is the $N \times 1$ vector of sample realizations of $y$ and $W_{iN}$ is a specified $1 \times N$ weighting vector; the components of $W_{iN}$ are non-negative and sum to one. The disturbances $u$ are usually assumed to be normally distributed, independent of $x$, and the model is estimated by maximum likelihood. See, for example, Cliff and Ord (1981), Doreian (1981), or Case (1991).
Equation (7) states that the behaviour of each person in the sample varies with a weighted average of the behaviours of the other sample members. Thus, the spatial correlation model assumes that an endogenous effect is present within the researcher’s sample rather than within the population from which the sample was drawn. This makes sense in studies of small-group interactions, where the sample is composed of clusters of friends, co-workers, or household members; see, for example, Duncan, Haller, and Portes (1968) or Erbring and Young (1979). But it does not make sense in studies of neighbourhood and other large-group social effects, where the sample members are randomly chosen individuals. Taken at face value, equation (7) implies that the sample members know who each other are and choose their outcomes only after having been selected into the sample.
The spatial correlation model does make sense in studies of large-group interactions if interpreted as a two-stage method for estimating a pure endogenous-effects model. In the first stage, one uses the sample data on $(y, x)$ to estimate $E(y \mid x)$ non-parametrically, and in the second stage, one estimates $(\beta, \eta)$ by finding the least-squares fit of $y$ to $[1, E_N(y \mid x), z]$, where $E_N(y \mid x)$ is the first-stage estimate of $E(y \mid x)$. Many non-parametric estimates of $E(y \mid x_i)$ are weighted averages of the form $E_N(y \mid x_i) = W_{iN} Y$, with $W_{iN}$ determining the specific estimate; see Härdle (1990). Hence, estimates of $(\beta, \eta)$ reported in the spatial correlation literature can be interpreted as estimates of pure endogenous-effects models.
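One concrete choice of weights is the cell average. The sketch below builds such a weighting matrix for a toy sample; the function name `cell_average_weights` and the data are hypothetical, and the weights here include each person's own observation, which is one possible convention rather than the only one.

```python
import numpy as np

def cell_average_weights(x):
    """N x N matrix whose i-th row is a weighting vector: equal weight on
    every sample member sharing x_i, zero weight elsewhere."""
    same_cell = (x[:, None] == x[None, :]).astype(float)
    return same_cell / same_cell.sum(axis=1, keepdims=True)

x = np.array([0, 0, 1, 1, 1, 2])
y = np.array([1.0, 3.0, 2.0, 4.0, 6.0, 5.0])
W = cell_average_weights(x)
ey_hat = W @ y        # first-stage estimate of E(y|x_i) for each i
```

Each row of `W` is non-negative and sums to one, and `W @ y` returns the sample mean of $y$ within each person's cell, so the first regressor of the spatial correlation model coincides with a cell-average first-stage estimate.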
Note that point estimates can be obtained for unidentified models. If condition a, b, or d of Proposition 2 holds, then $E(y \mid x)$ is a linear function of $[1, z]$. But the estimate $E_N(y \mid x)$ typically is linearly independent of $[1, z]$. So the two-stage procedure typically produces an estimate for $\beta$ even when this parameter is unidentified.⁷
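This can be seen in a simulation where condition b holds. The sketch assumes scalar $z$ drawn independently of a discrete $x$ and an outcome with no social effect; all values are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.integers(0, 10, size=n)      # ten reference groups
z = rng.normal(size=n)               # independent of x, so E(z|x) is constant
y = 1.0 + 0.8 * z + rng.normal(size=n)   # assumed DGP with no social effect

# Population regressors [1, E(y|x), z]: E(y|x) is the constant 1.0 here.
rank_population = np.linalg.matrix_rank(
    np.column_stack([np.ones(n), np.full(n, 1.0), z]))

# Sample regressors [1, E_N(y|x), z] with a cell-average first stage.
ey_hat = (np.bincount(x, weights=y) / np.bincount(x))[x]
design = np.column_stack([np.ones(n), ey_hat, z])
rank_sample = np.linalg.matrix_rank(design)
beta_hat = np.linalg.lstsq(design, y, rcond=None)[0][1]
```

The population regressors are linearly dependent, yet sampling noise in the cell averages makes the sample design matrix full rank, so least squares returns a finite `beta_hat` that reflects only that noise.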
3. NON-LINEAR ENDOGENOUS-EFFECTS MODELS
How do the findings reported in Section 2 fare when the social-effects model is not necessarily linear? This section examines two situations. Section 3.1 assumes that one does not know the form of the regression and so analyses social effects non-parametrically. Section 3.2 assumes that the regression is a member of a specified non-linear family of functions. To keep the analysis relatively simple, I restrict attention to pure endogenous-effects models.
7. It is necessary to point out that empirical studies reporting two-stage estimates of social-effects models have routinely misreported the sampling distribution of their estimates. The practice in two-stage estimation of exogenous-effects models has been to treat the first-stage estimate $E_N(z \mid x)$ as if it were $E(z \mid x)$ rather than an estimate thereof. The literature on spatial correlation models has presumed that equation (7) holds as stated and has not specified how the weights $W_{iN}$ should change with $N$.
Two-stage estimation of social-effects models is similar to other semi-parametric two-stage estimation problems whose asymptotic properties have been studied recently. Ahn and Manski (1993), Ichimura and Lee (1991), and others have analyzed the asymptotic behaviour of various estimators whose first stage is non-parametric regression and whose second stage is parametric estimation conditional on the first-stage estimate. It is typically found that the second-stage estimate of an identified parameter is $\sqrt{N}$-consistent with a limiting normal distribution if the first-stage estimator is chosen appropriately. The variance of the limiting distribution is typically larger than that which would prevail if the first-stage regression were known rather than estimated. It seems likely that this result holds here as well.
3.1. Non-parametric analysis
Assume that, for some unknown function $f: R^1 \times R^K \to R^1$,
$$E(y \mid x, z) = f[E(y \mid x), z]. \tag{8}$$
This non-parametric endogenous-effects model, with the implied social equilibrium equation
$$E(y \mid x) = \int f[E(y \mid x), z]\, dP(z \mid x), \tag{9}$$
drops the linearity assumption imposed in Section 2.3.
In this non-parametric setting, one measures endogenous effects directly rather than through a parameter. That is, one fixes $z$ and asks how $f[E(y \mid x), z]$ varies with $E(y \mid x)$. Let $\zeta \in R^K$ and $(e_0, e_1) \in R^2$. Then the contrast
$$T(e_1, e_0, \zeta) \equiv f(e_1, \zeta) - f(e_0, \zeta) \tag{10}$$
measures the effect at $\zeta$ of exogenously changing mean reference-group behaviour from $e_0$ to $e_1$. In the absence of functional form assumptions, $f(\cdot, \cdot)$ is identified on the support of $[E(y \mid x), z]$. The contrast $T(e_1, e_0, \zeta)$ is identified if and only if $(e_1, \zeta)$ and $(e_0, \zeta)$ are both on the support of $[E(y \mid x), z]$.
To say more requires that one characterize the support of $[E(y \mid x), z]$. Useful conditions ensuring that contrasts are identified seem hard to obtain. On the other hand, I can show that contrasts are generically not identified if $x$ and $z$ are either functionally dependent or statistically independent.
Proposition 3. In the non-parametric endogenous-effects model (8), no contrasts of the form (10) are identified if any of these conditions hold, almost everywhere:
(e) $z$ is a function of $x$ and the social equilibrium equation (9) has a unique solution.
(f) $z$ is statistically independent of $x$ and the social equilibrium equation (9) has a unique solution.
(g) $x$ is a function of $z$.
Proof. If $E(y \mid x)$ is a function of $z$, $(e_1, \zeta)$ and $(e_0, \zeta)$ cannot both be on the support of $[E(y \mid x), z]$. So no contrasts are identified. Conditions e, f, and g all imply that $E(y \mid x)$ is a function of $z$.
(e) Let $\zeta \in R^K$. The distribution of $x$ conditional on the event $[z = \zeta]$ is concentrated on the set $X(\zeta) \equiv [x: z(x) = \zeta]$; hence, the distribution of $E(y \mid x)$ conditional on the event $[z = \zeta]$ is concentrated on $[E(y \mid x), x \in X(\zeta)]$. For $x \in X(\zeta)$, $P(z \mid x)$ has all its mass at the point $\zeta$. Hence, for $x \in X(\zeta)$, equation (9) reduces to $E(y \mid x) = f[E(y \mid x), \zeta]$. So $E(y \mid x)$ solves the same equation for each $x$ in $X(\zeta)$. The uniqueness assumption then implies that $E(y \mid x)$ is constant on $X(\zeta)$. So $E(y \mid x)$ is a function of $z$, almost everywhere.
(f) Statistical independence means that $P(z \mid x) = P(z)$. Hence (9) reduces to $E(y \mid x) = \int f[E(y \mid x), z]\, dP(z)$. So $E(y \mid x)$ solves the same equation for each value of $x$. The uniqueness assumption then implies that $E(y \mid x)$ is constant for all values of $x$.
(g) Let $x \equiv x(z)$. Then $E[y \mid x(z)]$ is a function of $z$.
A useful way to think about the proposition is to imagine that $f(\cdot, \cdot)$ is really linear with $\beta \neq 1$ but, not knowing this, one proceeds non-parametrically. The linear model has a unique social equilibrium so conditions e, f, and g all apply. Taken together, the conditions say that non-parametric identification of endogenous effects is precluded if the attributes defining reference groups and those directly affecting outcomes are functionally dependent or statistically independent. Non-parametric study of social effects remains conceivable only if $x$ and $z$ are “moderately related” random variables.
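The role of statistical independence can be illustrated by solving the social equilibrium equation for a few reference groups. The sketch below assumes a particular response function $f$ with a unique equilibrium, a three-point support for scalar $z$, and made-up conditional distributions; none of these choices come from the paper.

```python
import numpy as np

def f(e, z):
    """Assumed response function; |df/de| <= 0.5, so (9) has one solution."""
    return 0.5 * np.tanh(e) + z

def social_equilibrium(z_values, z_probs, n_iter=200):
    """Solve E = sum_z f(E, z) P(z|x) by fixed-point iteration."""
    e = 0.0
    for _ in range(n_iter):
        e = float(np.sum(f(e, z_values) * z_probs))
    return e

z_values = np.array([-1.0, 0.0, 1.0])
# Rows are P(z|x) for three reference groups x = 0, 1, 2.
p_independent = np.array([[0.2, 0.5, 0.3]] * 3)
p_dependent = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]])

e_independent = [social_equilibrium(z_values, p) for p in p_independent]
e_dependent = [social_equilibrium(z_values, p) for p in p_dependent]
```

When $z$ is independent of $x$, every group has the same equilibrium mean, so the data contain a single value of $E(y \mid x)$ and no contrast between two values can be formed. When the conditional distributions differ across groups, the equilibrium means differ too, which is the variation a non-parametric analysis needs.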
The prospects for identification may improve if $f(\cdot, \cdot)$ is non-linear in a manner that generates multiple social equilibria. In this case, condition g remains in effect but e and f do not apply. When there are multiple equilibria, $E(y \mid x)$ may fluctuate from one equilibrium value to another and so may not be a function of $z$.
3.2. Binary response models
Perhaps the most familiar non-linear parametric models with endogenous effects are binary response models. Let $y$ be a binary random variable and assume that
$$P(y = 1 \mid x, z) = H[\alpha + \beta P(y = 1 \mid x) + z'\eta], \tag{11}$$
where $H(\cdot)$ is a specified continuous, strictly increasing distribution function. For example, if $H(\cdot)$ is the logistic distribution, we have a logit model with social effects.
Models of form (11) have been estimated by two-stage methods. The usual approach is to estimate $P(y = 1 \mid x)$ non-parametrically and then estimate $(\beta, \eta)$ by maximizing the quasi-likelihood in which $P_N(y = 1 \mid x)$ takes the place of $P(y = 1 \mid x)$. Examples include Case and Katz (1991) and Gamoran and Mare (1989). A multinomial response model estimated in this manner appears in Manski and Wise (1983, Chapter 6).
The literature has not addressed the coherency and identification of model (11) but I can settle the coherency question here. The model is coherent if there is a solution to the social equilibrium equation
$$P(y = 1 \mid x) = \int H[\alpha + \beta P(y = 1 \mid x) + z'\eta]\, dP(z \mid x). \tag{12}$$
If $\beta = 0$, $E(y \mid x) = \int H(\alpha + z'\eta)\, dP(z \mid x)$ so there is a unique solution to (12). If $\beta < 0$, the right-hand side of (12) decreases strictly and continuously from $\int H(\alpha + z'\eta)\, dP(z \mid x)$ to $\int H(\alpha + \beta + z'\eta)\, dP(z \mid x)$ as $E(y \mid x)$ rises from 0 to 1. Meanwhile, the left-hand side increases strictly and continuously from 0 to 1. Hence the left- and right-hand sides cross at a unique value of $E(y \mid x)$. Finally, let $\beta > 0$. In this case, a solution exists because the right-hand side of (12) increases strictly and continuously from $\int H(\alpha + z'\eta)\, dP(z \mid x)$ to $\int H(\alpha + \beta + z'\eta)\, dP(z \mid x)$ as $E(y \mid x)$ rises from 0 to 1. Meanwhile, the left-hand side traverses the larger interval $[0, 1]$. Hence, the left-hand side must cross the right-hand side from below at some value of $E(y \mid x)$.
The above shows that binary response models with endogenous effects are always coherent. When $\beta \leq 0$, these models have unique social equilibria. When $\beta > 0$, it does not appear possible to determine the number of equilibria without imposing additional structure. The conditions under which the parameters $(\alpha, \beta, \eta)$ are identified have not been established.
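The coherency argument can be explored numerically for the logit case. The sketch below counts solutions of the social equilibrium equation (12) on a grid; the function `count_equilibria`, the degenerate distribution of $z$, and the parameter values are assumptions chosen for the illustration, and a grid search can miss solutions that lie closer together than the grid spacing.

```python
import numpy as np

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def count_equilibria(alpha, beta, eta, z_values, z_probs, n_grid=2001):
    """Count solutions p in [0, 1] of equation (12) with logistic H by
    counting sign changes of RHS(p) - p on a grid."""
    p = np.linspace(0.0, 1.0, n_grid)
    index = alpha + beta * p[:, None] + eta * z_values[None, :]
    gap = logistic(index) @ z_probs - p
    sign = np.where(gap >= 0.0, 1, -1)
    return int(np.sum(sign[:-1] != sign[1:]))

z_values = np.array([0.0])     # degenerate z keeps the example transparent
z_probs = np.array([1.0])
n_negative = count_equilibria(0.5, -2.0, 1.0, z_values, z_probs)
n_zero = count_equilibria(0.5, 0.0, 1.0, z_values, z_probs)
n_positive = count_equilibria(-4.2, 8.0, 1.0, z_values, z_probs)
```

For the two cases with $\beta \leq 0$ the grid finds a single equilibrium, while the assumed case with a large positive $\beta$ yields three, which illustrates why the number of equilibria cannot be pinned down for $\beta > 0$ without further structure.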
4. DYNAMIC MODELS
The models posed thus far assume contemporaneous effects. It may well be more realistic to assume some lag in the transmission of these effects. Some authors, including Alessie and Kapteyn (1991) and Borjas (1991), have estimated the following dynamic version of the linear model (2):
$$E_t(y \mid x, z) = \alpha + \beta E_{t-1}(y \mid x) + E_{t-1}(z \mid x)'\gamma + x_t'\delta + z_t'\eta, \tag{13}$$
where $E_t$ and $E_{t-1}$ denote expectations taken at periods $t$ and $t-1$. The idea is that non-social forces act contemporaneously but social forces act on the individual with a lag. If $\{E(z \mid x), x, z\}$ are time-invariant and -1