ASSIGNMENT 3
1. Structural models (15 marks). Consider the two-equation structural model:
$$Y = AY + BX + \varepsilon, \qquad E[\varepsilon X'] = 0,$$
where $Y = (Y_1, Y_2)'$, $X = (X_1, X_2)'$ and $\varepsilon = (\varepsilon_1, \varepsilon_2)'$ are $2 \times 1$ vectors,
$$A = \begin{pmatrix} 0 & \alpha_1 \\ \alpha_2 & 0 \end{pmatrix}, \qquad |\alpha_1 \alpha_2| < 1, \qquad B = \begin{pmatrix} \beta_1 & 0 \\ 0 & \beta_2 \end{pmatrix}.$$
(a) Assuming that $E[XX']^{-1}$ exists, compute $\Pi \equiv E[YX']E[XX']^{-1}$ as a function of $A$, $B$. (2 marks)
(b) Compute each element of $\Pi$ as a function of the structural parameters. (3 marks)
(c) Under what conditions on $\beta_1$ and $\beta_2$ are the structural parameters identified? (4 marks)
(d) Suppose now that we have a sample of $(Y_i, X_i)$ for $i = 1, \dots, N$. Propose a consistent estimator of $\Pi$. (3 marks)
(e) Propose consistent estimators of the structural parameters. (3 marks)

2. Panel data (20 marks). Consider the linear panel data model for individuals $i = 1, \dots, N$ in periods $t = 1, \dots, T$,
$$y_{it} = \alpha + x_{it}'\beta + \alpha_i + \varepsilon_{it}, \qquad (1)$$
where $y_{it}$ is a continuous outcome of interest and $x_{it}$ comprises $K$ regressors. Suppose also that the individual effects ($\alpha_i$) are i.i.d. with $E[\alpha_i] = 0$, $V[\alpha_i] = \sigma_\alpha^2$, that the errors ($\varepsilon_{it}$) are i.i.d. with $E[\varepsilon_{it}] = 0$, $V[\varepsilon_{it}] = \sigma_\varepsilon^2$, and that
$$\mathrm{COV}[\alpha_j, \varepsilon_{it}] = 0 \ \text{otherwise}. \qquad (2)$$
(a) If $x_{it}$ contains dummy variables for period effects (i.e. $\gamma_t$ for $t = 1, \dots, T - 1$), explain why the period effects can be consistently estimated for fixed $T$ and $N \to \infty$ but the individual effects cannot. (2 marks)
(b) Stacking the data in long form, the GLS estimator can be written as $(X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}y)$. What are the dimensions and entries of $y$, $X$, and $\Omega$? Which assumption(s) are necessary for the GLS estimator to be consistent for fixed $T$ and $N \to \infty$? (3 marks)
(c) Under which assumption(s) is the first differences estimator consistent for fixed $T = 2$ and $N \to \infty$? Use your assumption(s) to show consistency for the specific case of $T = 2$, $K = 1$. (3 marks)
(d) Find $V[\beta_{FD} \mid X]$, where $\beta_{FD}$ is the estimator considered in part (c). You may continue to use $T = 2$, $K = 1$ and the assumption(s) you made in part (c). (3 marks)
(e) A university ran an experiment in which a cohort of first year Psychology students were randomly assigned to tutorial groups of different sizes. The range of possible sizes was $\{2, 3, 4, \dots, 28, 29, 30\}$. You have panel data for $N = 515$ students over $T = 4$ courses.
All students took the same 4 courses. The randomization occurred at the student-course level, hence all 4 of a student’s tutorial groups could be different sizes. For student $i$ in course $t$, the variables to which you have access are $y_{it}$, which is the natural logarithm of their percentage grade, and $x_{it}$, which is the natural logarithm of their tutorial group size.
Your colleague posits that small group tuition improves academic performance, but that students of higher intrinsic ability have a lower return to small group tuition. Suggest an appropriate empirical model to test this assertion, and explain how its parameters can be estimated using the data described in the previous paragraph. You must state and justify any assumptions which you make. You need not keep any features of the model in equations (1)-(2) above. (9 marks)

3. Non-parametric analysis (15 marks). You have a sample of hourly wages for US manufacturing workers of size $N = 100$. The sample standard deviation is 1.3628 and the sample interquartile range is 5. Exactly sixty of the observations fall in the interval $(19, 21)$.
(a) Given a uniform kernel, explain why a good choice of the bandwidth is $h = 1$. (2 marks)
(b) Do rule-of-thumb bandwidths satisfy the asymptotic criteria for uniform consistency of the Kernel density estimator? Justify your answer. (2 marks)
(c) Using the Kernel and bandwidth from part (a), compute the Kernel density estimator $\hat{f}(20)$. (3 marks)
(d) Explain why $\hat{f}(20)$ is unbiased when $f(y)$ is linear for $y$ in a neighbourhood of 20. (2 marks)
(e) Construct a 0.9 confidence interval for $f(20)$. (6 marks)

4. Data analysis (50 marks) (5 pages max.). An important research area in health economics regards inter-generational transmission of health. The attached data comprise a sample of 1,793 individuals aged 18-30, and includes demographic information and body mass index (BMI), which will be our measure of health. Use the “describe” command for a more detailed description of the data.
In an attempt to establish the extent of inter-generational transmission in health, you will model the relationship between offspring BMI and the BMI of the parent of the same gender as the offspring. You should perform your analysis so that your empirical model uses all 1,793 observations simultaneously to estimate one equation, with dependent variable bmi and one variable of interest, parentbmi (constructed so that parentbmi=dadbmi for male offspring and parentbmi=mombmi for female offspring). Throughout your analysis, you should control for age, education, employment status, gender, family income, health insurance, kids and race. (Note: All variables are for offspring unless otherwise stated in the variable description).
(a) Discuss the extent to which a regression based approach (i.e., using the command “regress”) will yield estimation results which have a causal interpretation regarding the inter-generational transmission of health. Implement a regression based approach and interpret the results. (5 marks)
(b) Discuss the extent to which the BMI of the parent of the opposite gender to the offspring is a suitable instrument for parentbmi. Use the “ivregress” (or “ivreg2”) command to implement IV estimation and compare with your analysis in part (a). (10 marks)
(c) Discuss the extent to which the education of the parent of the same gender as the offspring is a suitable instrument for parentbmi. Use the “ivregress” (or “ivreg2”) command to implement IV estimation and compare with your analysis in parts (a) and (b). (10 marks)
(d) Propose and implement an alternative empirical strategy to identify the extent of inter-generational transmission in health. Discuss your findings and any limitations of your approach. (25 marks)
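
The construction of parentbmi described in Question 4 (father's BMI for male offspring, mother's BMI for female offspring) could be sketched as follows. This is an illustration in Python rather than Stata, and it rests on assumptions: the 0/1 indicator `female` is a hypothetical name and coding for the gender variable (the actual one should be checked with “describe”), and the sample rows are made up. Only `bmi`, `dadbmi`, `mombmi` and `parentbmi` are names taken from the question.

```python
# Sketch: build parentbmi from dadbmi/mombmi, assuming a hypothetical
# 0/1 indicator `female` (the real gender variable may be named or coded differently).

def add_parentbmi(rows, gender_key="female"):
    """Return copies of rows with parentbmi set to the same-gender parent's BMI."""
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        # Female offspring are matched to the mother, male offspring to the father.
        if row[gender_key] == 1:
            new_row["parentbmi"] = row["mombmi"]
        else:
            new_row["parentbmi"] = row["dadbmi"]
        out.append(new_row)
    return out


# Made-up rows for illustration only; these are not values from the assignment data.
sample = [
    {"bmi": 24.1, "female": 0, "dadbmi": 27.5, "mombmi": 23.0},
    {"bmi": 22.3, "female": 1, "dadbmi": 29.0, "mombmi": 25.4},
]
result = add_parentbmi(sample)
print([r["parentbmi"] for r in result])
```

Keeping male and female offspring in one data set with a single parentbmi column is what allows one equation to be estimated on all observations at once, with gender entering as a control.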