---
title: "Assignment 8 Notes"
output:
  html_document:
    toc: yes
---
```{r global_options, include=FALSE}
knitr::opts_chunk$set(collapse=TRUE)
```
## 1. Air Pollution Data
When there is a clear dependent variable, that variable should go on
the vertical axis; here that is `ozone.level`.
If you use `ggpairs` it is a good idea to put the dependent variable
last so you have a plot with the dependent variable on the vertical
axis against each predictor variable:
```{r}
library(SemiPar)
data(calif.air.poll)
library(GGally)
ggpairs(calif.air.poll[c(2 : 4, 1)])
```
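
Selecting columns by position, as in `c(2 : 4, 1)`, depends on the column order of the data frame. A minimal sketch of a name-based alternative follows; the helper `response_last` and the toy data frame are hypothetical, introduced only for illustration.

```{r}
# Hypothetical helper: move the named response column to the end.
response_last <- function(df, response) {
  stopifnot(response %in% names(df))
  df[c(setdiff(names(df), response), response)]
}

# Toy data frame (made-up values) with the response in the first column.
toy <- data.frame(y = c(3, 5, 4), x1 = c(1, 2, 3), x2 = c(9, 8, 7))
names(response_last(toy, "y"))
```

For the air pollution data the corresponding call would be `ggpairs(response_last(calif.air.poll, "ozone.level"))`, which should give the same column order as the index vector provided `ozone.level` is the first of four columns.
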
The conditional distributions show an increasing relation between
ozone level and inversion temperature; the slope decreases with
increasing inversion height.
```{r}
library(lattice)
xyplot(ozone.level ~ inversion.base.temp |
           equal.count(inversion.base.height, 9, overlap = 0),
       type = c("p", "smooth"), data = calif.air.poll, col.line = "red")
```
The top two height panels both contain points with heights of 5000.
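
With `overlap = 0`, two panels should share a height value only when several observations are tied at that value, so the shared 5000 suggests many ties at the top of the range. A rough check, assuming ties are indeed the cause, is to compare the number of tied observations with the approximate panel size. The helper `ties_vs_panel` is hypothetical and the heights below are made up.

```{r}
# Hypothetical helper: how many observations equal `value`, and how many
# observations an equal-count panel holds on average.
ties_vs_panel <- function(x, value, panels) {
  x <- x[!is.na(x)]
  c(tied = sum(x == value), per_panel = length(x) / panels)
}

# Made-up heights: 5 of the 12 observations sit at 5000.
h <- c(300, 800, 1200, 1900, 2500, 3100, 3600, 5000, 5000, 5000, 5000, 5000)
ties_vs_panel(h, value = 5000, panels = 3)
```

For the air pollution data the corresponding call would be `ties_vs_panel(calif.air.poll$inversion.base.height, 5000, 9)`. A `tied` count above `per_panel` would mean the tied observations cannot all fit in a single panel.
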
## 2. Olive Oils
```{r, message = FALSE}
library(dplyr)
library(ggplot2)
library(GGally)
olives <- read.csv("http://homepage.divms.uiowa.edu/~luke/data/olives.csv")
```
Focus on the northern region:
```{r}
olivesN <- filter(olives, Region == "North")
olivesN <- droplevels(olivesN)
```
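
`droplevels` matters only for factor columns: filtering rows keeps every original level, including levels with no remaining rows. Since R 4.0.0, `read.csv` leaves strings as character vectors by default, in which case `droplevels` changes nothing. The toy data below is made up, and base R `subset` stands in for `filter` so the sketch runs without `dplyr`.

```{r}
# Toy data (made-up values); Region is stored as a factor on purpose.
toy_oils <- data.frame(
  Region = factor(c("North", "South", "North", "Sardinia")),
  oleic  = c(7900, 7100, 7800, 7300)
)

north <- subset(toy_oils, Region == "North")
levels(north$Region)              # all three levels are still recorded
levels(droplevels(north)$Region)  # only "North" remains
```

Dropping unused levels keeps empty groups out of legends and tables built from the filtered data.
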
A parallel coordinates plot of all the values suggests looking more
closely at `oleic`, `stearic`, and `linolenic`:
```{r}
ggparcoord(olivesN, 3:10, groupColumn="Area", scale = "uniminmax")
```
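
The `scale = "uniminmax"` option rescales each variable separately so that its minimum maps to 0 and its maximum to 1, which puts all the selected columns on a common vertical axis. A minimal base R sketch of that rescaling, using the hypothetical helper `rescale01` and made-up values:

```{r}
# Hypothetical helper: min-max rescaling of one numeric vector to [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

rescale01(c(7000, 7500, 8000))   # 0.0 0.5 1.0
```

Because each column is rescaled on its own, positions on the axis are comparable only within a variable, not across variables.
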
The plot of `stearic` against `oleic` shows that the Umbria oils all
have `oleic` values above 7870:
```{r}
ggplot(olivesN) +
    geom_point(aes(oleic, stearic, color = Area)) +
    geom_vline(aes(xintercept = 7870), linetype = 2)
```
Among the oils with `oleic > 7870`, all of the Umbria oils, and only the
Umbria oils, have `stearic < 230` and `linolenic > 15`:
```{r}
ggplot(filter(olivesN, oleic > 7870)) +
    geom_point(aes(linolenic, stearic, color = Area)) +
    geom_vline(aes(xintercept = 15), linetype = 2) +
    geom_hline(aes(yintercept = 230), linetype = 2)
```
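
Taken together, the three thresholds amount to a simple classification rule for the Umbria oils. One way to check such a rule numerically is to cross-tabulate it against `Area`. The helper `umbria_rule` and the toy rows below are hypothetical and only illustrate the mechanics.

```{r}
# Hypothetical helper encoding the thresholds read off the plots above.
umbria_rule <- function(oleic, stearic, linolenic) {
  oleic > 7870 & stearic < 230 & linolenic > 15
}

# Made-up rows, not taken from the olive oil data.
toy_rule <- data.frame(
  Area      = c("Umbria", "Umbria", "Other", "Other"),
  oleic     = c(7950, 7990, 7900, 7600),
  stearic   = c(200, 210, 250, 240),
  linolenic = c(30, 25, 10, 20)
)
with(toy_rule, table(Area, rule = umbria_rule(oleic, stearic, linolenic)))
```

On the real data the analogous call would be `with(olivesN, table(Area, rule = umbria_rule(oleic, stearic, linolenic)))`. If the statement above holds, the `TRUE` column has counts only in the Umbria row, and the Umbria row has no counts in the `FALSE` column.
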