---
title: "Problem Set 3"
subtitle: "Behavioral Economics, Boston College"
output:
  html_document:
    theme: readable
    highlight: pygments
    number_sections: TRUE
    toc: yes
    toc_float:
      collapsed: yes
---
```{css question/answer styling, echo=FALSE}
/* do nothing to this chunk! */
.written_answer {
  padding: 1em;
  background: #FFF8DC;
  color: gray;
  border-left: 5px solid #fce68b;
  border-radius: 0px;
  margin-top: 2em;
  margin-bottom: 2em;
}
blockquote {
  font-size: 16px;
  background-color: #f0f5ff;
  padding-left: 5px;
}
```
```{r chunk options, echo=FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE)
```
# Set-up and background {-}
The assignment is worth **100 points**. There are **27 questions**. You should have the following packages installed:
```{r setup, message=FALSE, warning=FALSE}
library(tidyverse)
library(patchwork)
library(fixest)
```
In this problem set you will summarize the paper [“Do Workers Work More if Wages Are High? Evidence from a Randomized Field Experiment”](https://www.aeaweb.org/articles?id=10.1257/aer.97.1.298) (Fehr and Goette, AER 2007) and recreate some of its findings.
# Big picture
>**[Q1]** What is the main question asked in this paper?
```{block q1, type = 'written_answer'}
Do workers work more when wages are temporarily high, that is, do they substitute labor intertemporally?
```
>**[Q2]** Recall the taxi cab studies where reference dependence is studied using observational data. What can an experimental study do that an observational study can’t?
```{block q2, type = 'written_answer'}
“A key feature of our experiment is the implementation of an exogenous and transitory increase in the commission rate by 25 percent. Therefore, we can be sure that unobserved supply or demand variations did not induce the change in the commission rate (i.e., the “wage” change)” (Ernst Fehr and Lorenz Götte, 2004). The experimental study implements an exogenous and transitory increase in the commission rate, so it can rule out that unobserved variations in supply or demand induced the wage change. An observational study cannot do that.
```
>**[Q3]** Summarize the field experiment design.
```{block q3, type = 'written_answer'}
The field experiment was run in a natural setting and implemented an exogenous change in the wage. This design eliminates the influence of unobserved supply and demand variations, and the participants knew the basic information about the experiment.
```
>**[Q4]** Summarize the laboratory experiment design. Why was it included with the study?
```{block q4, type = 'written_answer'}
The laboratory experiment design helps experimenters create a controllable environment in which to test hypotheses. It was included in the study because a laboratory experiment provides the best control over the participants, the manipulated conditions and the environment.
```
>**[Q5]** Summarize the main results of the field experiment.
```{block q5, type = 'written_answer'}
There are two measures of labor supply: the amount of revenue generated and the number of deliveries completed. The estimates of the treatment effect are almost identical for either measure. The results imply a large intertemporal elasticity of substitution and a large, highly significant effect of a temporary wage increase on the total labor supply of the treated group.
```
>**[Q6]** Summarize the main results of the laboratory experiment.
```{block q6, type = 'written_answer'}
The wage increase has a strong positive impact on total labor supply during the two four-week periods. On average, participants work about four more shifts during the four weeks in which they receive a higher wage. However, the wage increase causes a decrease in revenue (effort) per shift of roughly 6 percent. Also, a higher degree of loss aversion is associated with a stronger negative impact of the wage increase on effort per shift, and vice versa.
```
>**[Q7]** Why are these results valuable? What have we learned? Motivate your discussion with a real-world example.
```{block q7, type = 'written_answer'}
These results are valuable because they show that people who do not display loss aversion work more when they receive higher wages, which offers companies a method to improve labor productivity. I have learned that higher pay increases a worker's productivity, but that this effect may not hold for people with higher loss aversion. There are many examples in the real world. The best example I can think of is a pay model in which employees earn an extra payment for each order. A delivery worker is paid extra for each order, and if the number of deliveries exceeds a certain amount, the extra payment per order goes up. This motivates them to deliver more orders, which indirectly increases their productivity.
```
# Replication
*Use `theme_classic()` for all plots.*
## Correlations in revenues across firms
*For this section please use `dailycorrs.csv`.*
```{r load dailycorrs, message=FALSE, warning=FALSE}
dailycorrs = read_csv("~/Desktop/empirical behavioral ec/assigment 3/dailycorrs.csv")
```
>**[Q8]** The authors show that earnings at Veloblitz and Flash are correlated. Show this with a scatter plot with a regression line and no confidence interval. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called `p1`.
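```{r q8}
p1 <- dailycorrs %>%
  ggplot(aes(x = logv, y = logf)) +
  geom_point() +
  # se = FALSE drops the confidence band, as the question requires
  geom_smooth(method = "lm", se = FALSE) +
  # Axis wording assumes logv and logf are log revenues at Veloblitz and Flash
  labs(title = "Daily revenues at Veloblitz and Flash",
       x = "Log revenue, Veloblitz", y = "Log revenue, Flash") +
  theme_classic()
```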
>**[Q9]** Next plot the kernel density estimates of revenues for both companies. Overlay the distributions and make the densities transparent so they are easily seen. Title your axes and the plot appropriately. Do not print the plot but assign it to an object called `p2`.
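```{r q9}
# Overlay both densities on one set of axes; the constant fill labels build the legend
# (assumes logv and logf are log revenues at Veloblitz and Flash)
p2 <- dailycorrs %>%
  ggplot() +
  geom_density(aes(x = logv, fill = "Veloblitz"), alpha = 0.3) +
  geom_density(aes(x = logf, fill = "Flash"), alpha = 0.3) +
  labs(title = "Distribution of daily revenues by firm",
       x = "Log revenue", y = "Density", fill = "Firm") +
  theme_classic()
```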
>**[Q11]** Now combine both plots using `library(patchwork)` and label the plots with letters.
```{r q11, message=FALSE}
patched <- p1 + p2
patched + plot_annotation(tag_levels = 'a')
```
## Tables 2 and 3
*For this section please use `tables1to4.csv`.*
```{r load tables1to4, message=FALSE, warning=FALSE}
tables1to4 = read_csv("~/Desktop/empirical behavioral ec/assigment 3/tables1to4.csv")
```
### Table 2
On page 307 the authors write:
“Table 2 controls for **individual fixed effects** by showing how, on average, the messengers’ revenues deviate from their person-specific mean revenues. Thus, a positive number here indicates a positive deviation from the person-specific mean; a negative number indicates a negative deviation.”
>**[Q12]** Fixed effects are a way to control for *heterogeneity* across individuals that is *time invariant.* Why would we want to control for fixed effects? Give a reason how bike messengers could be different from each other, and how these differences might not vary over time.
```{block q12, type = 'written_answer'}
We want to control for fixed effects because doing so reduces the threat of omitted variable bias: differences across individuals that do not vary over time can no longer influence the estimates. The bike messengers differ from each other because they keep receipts from each delivery they make on a shift, and this difference would not change over time since their salaries are based on each shift.
```
>**[Q13]** Create a variable called `totrev_fe` and add it to the dataframe. This requires you to “average out” each individual’s revenue for a block from their average revenue: $x_i^{fe} = x_{it} - \bar{x}_i$ where $x_i^{fe}$ is the fixed effect revenue for $i$.
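```{r q13}
# Assumes revenue is stored in `totrev` and messengers are identified by `fahrer`
tables1to4 <- tables1to4 %>%
  group_by(fahrer) %>%
  mutate(totrev_fe = totrev - mean(totrev, na.rm = TRUE)) %>%
  ungroup()
```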
>**[Q14]** Use `summarise()` to recreate the findings in Table 2 for “Participating Messengers” using your new variable `totrev_fe`. (You do not have to calculate the differences in means.)
>
> In addition to calculating the fixed-effect controlled means, calculate the standard errors. Recall the standard error is $\frac{s_{jt}}{\sqrt{n_{jt}}}$ where $s_{jt}$ is the standard deviation for treatment $j$ in block $t$ and $n_{jt}$ is the corresponding number of observations.
>
> (Hint: use `n()` to count observations.) Each calculation should be named to a new variable. Assign the resulting dataframe to a new dataframe called `df_avg_revenue`.
```{r q14}
# your code here
```
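As a hint, the grouped mean and its standard error can be sketched on made-up numbers. Everything in this block is an assumption for illustration: the toy data, the grouping column `treat_group`, and the output names `mean_rev`, `n_obs` and `se_rev` are hypothetical and are not taken from `tables1to4.csv`.

```r
library(dplyr)

# Toy data with made-up values; column names are hypothetical stand-ins
toy <- tibble(
  block       = rep(c(1, 2), each = 4),
  treat_group = rep(c(0, 1), times = 4),
  totrev_fe   = c(-2, 1, 0, 3, 4, -1, 2, -3)
)

toy_avg <- toy %>%
  group_by(block, treat_group) %>%
  summarise(
    mean_rev = mean(totrev_fe),              # fixed-effect controlled mean
    n_obs    = n(),                          # observations in the cell
    se_rev   = sd(totrev_fe) / sqrt(n_obs),  # s / sqrt(n)
    .groups  = "drop"
  )

toy_avg
```

Each calculation gets its own name inside `summarise()`, so the result has one row per block and treatment combination.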
>**[Q15]** Plot `df_avg_revenue`. Use points for the means and error bars for standard errors of the means.
>
*To dodge the points and size them appropriately, use*
```{r, eval = FALSE}
geom_point(position = position_dodge(width = 0.5), size = 4)
```
*To place the error bars use*
```{r, eval=FALSE}
geom_errorbar(aes(
  x = block,
  ymin = [MEAN] - [SE], ymax = [MEAN] + [SE]),
  width = .1,
  position = position_dodge(width = 0.5))
```
*You will need to replace `[MEAN]` with whatever you named your average revenues and `[SE]` with whatever you named your standard errors.*
```{r q15}
# your code here
```
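The two geoms above can be combined as in the following sketch. It runs on a small made-up summary table, so the values and the names `treat_group`, `mean_rev` and `se_rev` are assumptions and should be swapped for whatever `df_avg_revenue` actually contains.

```r
library(ggplot2)

# Made-up summary table standing in for df_avg_revenue (hypothetical names and values)
toy_avg <- data.frame(
  block       = factor(rep(1:3, times = 2)),
  treat_group = rep(c("A", "B"), each = 3),
  mean_rev    = c(-50, 400, -350, 100, -300, 200),
  se_rev      = c(80, 90, 85, 75, 95, 70)
)

p <- ggplot(toy_avg, aes(x = block, y = mean_rev, color = treat_group)) +
  # Dodge so the two groups do not overlap within a block
  geom_point(position = position_dodge(width = 0.5), size = 4) +
  # Bars span one standard error on either side of the mean
  geom_errorbar(aes(ymin = mean_rev - se_rev, ymax = mean_rev + se_rev),
                width = 0.1, position = position_dodge(width = 0.5)) +
  labs(x = "Block", y = "Mean revenue (deviation from own mean)",
       color = "Group") +
  theme_classic()
```

Using the same `position_dodge(width = 0.5)` in both layers keeps each error bar centered on its point.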
>**[Q16]** Interpret the plot.
```{block q16, type = 'written_answer'}
[your written answer here]
```
### Table 3
>**[Q17]** Recreate the point estimates in Model (1) in Table 3 by hand (you don’t need to worry about the standard errors). Assign it to object `m1`. Recreating this model requires you to control for individual fixed effects and estimate the following equation where $\text{H}$ is the variable `high`, $\text{B2}$ is the second block (`block == 2`) and $\text{B3}$ is the third block (`block == 3`):
$$
y_{ijt} - \bar{y}_{ij} = \beta_1 (\text{H}_{ijt} - \bar{\text{H}}_{ij}) + \beta_2 (\text{B2}_{ijt} - \bar{\text{B2}}_{ij}) + \beta_3 (\text{B3}_{ijt} - \bar{\text{B3}}_{ij}) + (\varepsilon_{ijt} - \bar{\varepsilon}_{ij})
$$
```{r q17}
# your code here
```
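The demeaning step in the equation above can be illustrated on a simulated panel. This is a minimal sketch in base R: the data are random draws, and the names `id`, `y`, `b2`, `b3` and `demean()` are hypothetical stand-ins, not columns or functions from the assignment.

```r
# Simulated panel: 10 hypothetical messengers observed in 3 blocks
set.seed(42)
toy <- expand.grid(block = 1:3, id = 1:10)
# Odd ids get the high wage in block 2, even ids in block 3
toy$high <- as.numeric((toy$id %% 2 == 1 & toy$block == 2) |
                       (toy$id %% 2 == 0 & toy$block == 3))
toy$b2 <- as.numeric(toy$block == 2)
toy$b3 <- as.numeric(toy$block == 3)
toy$y  <- 5 * toy$id + 2 * toy$high + 0.5 * toy$block + rnorm(nrow(toy))

# Subtract each individual's own mean from a variable
demean <- function(x, g) x - ave(x, g)

toy_dm <- data.frame(
  y    = demean(toy$y, toy$id),
  high = demean(toy$high, toy$id),
  b2   = demean(toy$b2, toy$id),
  b3   = demean(toy$b3, toy$id)
)

# No intercept: every demeaned variable already has mean zero
m_within <- lm(y ~ 0 + high + b2 + b3, data = toy_dm)
coef(m_within)
```

The point estimates from this regression match those from a regression with one dummy per individual, although the standard errors that `lm` reports here do not account for the degrees of freedom used by the individual means.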
>**[Q18]** Now recreate the same point estimates using `lm` and assign it to object `m2`. You are estimating the model below, where $\text{F}_i$ is the dummy variable for each messenger (`fahrer`). **Make sure to cluster the standard errors at the messenger level**. (Use `lmtest` and `sandwich` for this.)
$$
y_{ijt} = \beta_0 + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \sum_{i=1}^{n} \alpha_i \text{F}_i + \varepsilon_{ijt}
$$
```{r q19}
# your code here
```
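A dummy-variable regression with clustered standard errors might look like the sketch below. It uses simulated data, so `id`, `y`, `b2` and `b3` are hypothetical names. `vcovCL()` comes from `sandwich` and `coeftest()` from `lmtest`.

```r
library(sandwich)
library(lmtest)

# Simulated panel: 10 hypothetical messengers observed in 3 blocks
set.seed(42)
toy <- expand.grid(block = 1:3, id = 1:10)
toy$high <- as.numeric((toy$id %% 2 == 1 & toy$block == 2) |
                       (toy$id %% 2 == 0 & toy$block == 3))
toy$b2 <- as.numeric(toy$block == 2)
toy$b3 <- as.numeric(toy$block == 3)
toy$y  <- 5 * toy$id + 2 * toy$high + 0.5 * toy$block + rnorm(nrow(toy))

# factor(id) adds one dummy per individual (one level is absorbed by the intercept)
m_dummy <- lm(y ~ high + b2 + b3 + factor(id), data = toy)

# Cluster-robust covariance matrix, clustering on the individual
vc_cluster <- vcovCL(m_dummy, cluster = ~id)

# Coefficient table that uses the clustered covariance matrix
ct <- coeftest(m_dummy, vcov. = vc_cluster)
ct[c("high", "b2", "b3"), ]
```

Only the rows for `high`, `b2` and `b3` are shown, since the individual dummies are nuisance parameters here.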
>**[Q20]** Now use `feols` to recreate Model (1), including the standard errors. Assign your estimates to the object `m3`. You are estimating the model below, where $\alpha_i$ is the individual intercept (i.e. the individual fixed effect):
$$
y_{ijt} = \alpha_i + \beta_1 \text{H}_{ijt} + \beta_2 \text{B2}_{ijt} + \beta_3 \text{B3}_{ijt} + \varepsilon_{ijt}
$$
```{r q20}
# your code here
```
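For orientation, here is a sketch of the `feols` syntax on simulated data. The variable names `id`, `y`, `b2` and `b3` are hypothetical. The part of the formula after `|` names the fixed effect, and `cluster = ~id` requests standard errors clustered on the individual.

```r
library(fixest)

# Simulated panel: 10 hypothetical messengers observed in 3 blocks
set.seed(42)
toy <- expand.grid(block = 1:3, id = 1:10)
toy$high <- as.numeric((toy$id %% 2 == 1 & toy$block == 2) |
                       (toy$id %% 2 == 0 & toy$block == 3))
toy$b2 <- as.numeric(toy$block == 2)
toy$b3 <- as.numeric(toy$block == 3)
toy$y  <- 5 * toy$id + 2 * toy$high + 0.5 * toy$block + rnorm(nrow(toy))

# Individual fixed effects are absorbed rather than estimated as dummies
m_fe <- feols(y ~ high + b2 + b3 | id, data = toy, cluster = ~id)

coef(m_fe)  # point estimates
se(m_fe)    # clustered standard errors
```

The point estimates should agree with a dummy-variable `lm` fit on the same data. The clustered standard errors may differ slightly from those of other packages because small-sample corrections are not identical across implementations.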
>**[Q21]** Compare the estimates in `m1`, `m2` and `m3`. What is the same? What is different? What would you say is the main advantage of using `feols()`?
```{block q21, type = 'written_answer'}
[your written answer]
```
>**[Q22]** Explain why you need to cluster the standard errors.
```{block q22, type = 'written_answer'}
[your written answer]
```