
---
title: "Assignment 4 Notes"
output:
  html_document:
    toc: yes
---

```{r global_options, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE)
```

## General Issues

Make sure you name your files as requested, including matching the
specified use of upper and lower case. This matters on file systems
that are case-sensitive.

Make sure to commit your work to your local repository and push your
commits to GitHub. We can only see what is on GitHub, not what is on
your computer. You can check what we see by going to the GitHub web
interface.

Include your name and the date in the header of your `.Rmd` file
using `author:` and `date:` tags.

Your HTML file should be a report of your findings.

Any graph you show should be discussed in your narrative.

Any code you show should be discussed in your narrative.

If you do not need to discuss a piece of code in the narrative,
use `echo = FALSE` to avoid showing it.

Plotting A against B means mapping A to the vertical axis and
B to the horizontal axis, so using `aes(x = B, y = A)`.
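As a minimal sketch of this convention, the following plots `mpg` against `wt`, so `mpg` goes on the vertical axis. The built-in `mtcars` data and the variable name `mapping` are illustrative choices, not part of the assignment.

```r
library(ggplot2)

# "Plot mpg against wt": mpg (A) goes on the vertical axis,
# wt (B) on the horizontal axis.
# mtcars is used only as a convenient built-in example data set.
mapping <- aes(x = wt, y = mpg)
p <- ggplot(mtcars, mapping) + geom_point()
p
```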

## 1. Life Expectancy and GDP Per Capita

One way to select the subset of four years:

```{r, message = FALSE}
library(dplyr)
library(ggplot2)
library(gapminder)
gap <- filter(gapminder, year %% 10 == 7 & year >= 1977)
```

Another possibility is:

```{r}
gap1 <- filter(gapminder, year %in% c(1977, 1987, 1997, 2007))
identical(gap, gap1)
```
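The two conditions can be checked on the year values alone. This sketch assumes, as in the `gapminder` data, that observations are recorded every five years from 1952 to 2007; the vector `years` is constructed here for illustration.

```r
# Years as recorded in gapminder: every five years from 1952 to 2007
# (constructed by hand here so the example needs no packages).
years <- seq(1952, 2007, by = 5)

# Condition 1: years ending in 7, from 1977 on.
sel1 <- years[years %% 10 == 7 & years >= 1977]

# Condition 2: explicit membership test.
sel2 <- years[years %in% c(1977, 1987, 1997, 2007)]

sel1
identical(sel1, sel2)
```

Both conditions pick out 1977, 1987, 1997, and 2007. The membership test is easier to read, while the arithmetic version avoids typing out the years.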

A faceted plot of life expectancy against GDP per capita, with color
encoding continent and area encoding population size:

```{r}
ggplot(gap, aes(gdpPercap, lifeExp, color = continent, size = pop)) +
    geom_point() + scale_size_area(max_size = 8) + facet_wrap(~ year)
```

Using facets:

* ensures all views use common axes;

* organizes the plots so they can be viewed as a unit instead of
  requiring scrolling;

* makes efficient use of screen real estate by only showing one
  legend, and axes only along the outer margins.

These are essentially four frames from the
[Gapminder animation](https://www.gapminder.org/tools/#$chart-type=bubbles).

## 2. Fuel Economy

A plot of city fuel economy against engine displacement, with
color encoding the number of cylinders and shape encoding the
transmission type:

```{r}
mpg1 <- mutate(mpg, cyl = factor(cyl), trans = substr(trans, 1, 4))
ggplot(mpg1, aes(y = cty, x = displ, color = cyl, shape = trans)) +
    geom_point(size = 2.5)
```

Even though `cyl` is a numeric attribute it seems best to treat it as a
categorical one.

When encoding an attribute in point color it is useful to make the
points larger so the color can be perceived better. This has to be
balanced against overplotting. This is a case where defaults often
need to be changed.

Using `substr` is a useful quick way to split the transmission types
into manual and automatic, but does not result in ideal labels.
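If nicer labels are wanted, one option is to test the prefix and assign the label explicitly. The sketch below uses a few values in the format found in `mpg$trans`; the vector `trans`, the name `trans_type`, and the label strings are illustrative assumptions.

```r
# A few transmission strings in the format used by mpg$trans
# (illustrative values).
trans <- c("auto(l5)", "manual(m5)", "auto(av)", "manual(m6)")

# Quick split: the first four characters give "auto" or "manu".
substr(trans, 1, 4)

# Alternative with full labels: test the prefix and choose the label.
trans_type <- ifelse(startsWith(trans, "auto"), "automatic", "manual")
factor(trans_type)
```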

For a given displacement level manual transmission cars are
generally more fuel efficient.

Cars with smaller engines and fewer cylinders are more fuel
efficient.

Five cylinders may seem like an odd number; the cars with five
cylinders in this data set are all from one manufacturer:

```{r}
filter(mpg, cyl == 5)
```

A bar chart showing the distribution of transmission types within
cylinder counts:

```{r}
ggplot(mpg1, aes(x = cyl, fill = trans)) + geom_bar(position = "dodge")
```

or as a stacked bar chart standardized to show relative proportions:

```{r}
ggplot(mpg1, aes(x = cyl, fill = trans)) + geom_bar(position = "fill")
```

Manual transmissions are about as common as automatic
transmissions among four- and five-cylinder cars, but less common on
cars with more cylinders.
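The proportions behind the standardized bar chart can also be computed directly. This is a sketch using base R's `table` and `prop.table` on the `mpg` data from `ggplot2`; the names `trans_type`, `counts`, and `props` are introduced here for illustration.

```r
library(ggplot2)  # provides the mpg data set

# Reduce transmission to "auto"/"manu" using the substr shortcut.
trans_type <- substr(mpg$trans, 1, 4)

# Cross-tabulate cylinder count by transmission type.
counts <- table(cyl = mpg$cyl, trans = trans_type)

# Row proportions: share of each transmission type within a cylinder count.
props <- prop.table(counts, margin = 1)
round(props, 2)
```

Each row of `props` sums to one, which is what `position = "fill"` displays as bar segments.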

## 3. Fuel Economy Again

Reading the data:

```{r, message = FALSE}
library(readr)
if (! file.exists("vehicles.csv.zip"))
    download.file("http://www.stat.uiowa.edu/~luke/data/vehicles.csv.zip",
                  "vehicles.csv.zip")
newmpg <- read_csv("vehicles.csv.zip", guess_max = 10000)
```

From the documentation for the
[data](https://www.fueleconomy.gov/feg/ws/index.shtml#vehicle) the
`city08` variable seems a reasonable match to the `cty` variable in
the `mpg` data set.

Select the data for model years 2009 to the present, keep the most
useful variables (renaming some), and make `trans` a factor:

```{r}
newmpg1 <- filter(newmpg, year >= 2009)
newmpg1 <- select(newmpg1,
                  make, model, year,
                  cty = city08,
                  trans = trany,
                  cyl = cylinders,
                  displ)
newmpg1 <- mutate(newmpg1, trans = factor(trans))
summary(newmpg1)
```

There are many variants of `trans` that need to be combined into
manual and automatic.

`cyl` and `displ` have some `NA` values.
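Missing values can be counted per column before plotting. A minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not taken from the vehicles data):

```r
# Hypothetical data frame with a few missing values.
toy <- data.frame(cty   = c(20, 120, 18),
                  cyl   = c(4, NA, 6),
                  displ = c(2.0, NA, NA))

# is.na() gives a logical matrix; column sums count the NA values
# in each variable.
na_counts <- colSums(is.na(toy))
na_counts
```

Applied to the real data, the same idea would be `colSums(is.na(newmpg1))`.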

`trans` levels can be combined using `substr` again or using
`fct_recode` or `fct_collapse` from the `forcats` package.

Using `fct_collapse` along with `grep`:

```{r}
library(forcats)
tlevs <- levels(newmpg1$trans)
head(grep("Auto", tlevs, value = TRUE))
head(grep("Manu", tlevs, value = TRUE))
ntrans <- fct_collapse(newmpg1$trans,
                       Automatic = grep("Auto", tlevs, value = TRUE),
                       Manual = grep("Manu", tlevs, value = TRUE))
newmpg1 <- mutate(newmpg1, trans = ntrans)
```
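To see what `fct_collapse` combined with `grep` does, here is a sketch on a small factor. The level names below are made up to resemble the `trans` values; they are not copied from the data, and the names `f`, `levs`, and `g` are illustrative.

```r
library(forcats)

# Hypothetical transmission labels resembling those in the vehicles data.
f <- factor(c("Automatic 4-spd", "Manual 5-spd",
              "Automatic (S6)", "Manual 6-spd"))
levs <- levels(f)

# grep(..., value = TRUE) returns the matching level names;
# fct_collapse() merges each set of old levels into one new level.
g <- fct_collapse(f,
                  Automatic = grep("Auto", levs, value = TRUE),
                  Manual = grep("Manu", levs, value = TRUE))
table(g)
```

Using `grep` to build the level sets avoids typing out every variant by hand, at the cost of relying on the variants sharing a recognizable prefix.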

There are many levels of `cyl`:

```{r}
summary(factor(newmpg1$cyl))
```

Collapsing the lower and higher levels will make color encoding more
effective:

```{r}
ncyl <- fct_collapse(factor(newmpg1$cyl),
                     "2 or 3" = c("2", "3"),
                     "10 or more" = c("10", "12", "16"))
levels(ncyl)
newmpg2 <- mutate(newmpg1, cyl = ncyl)
```

An initial plot:

```{r}
ggplot(newmpg2, aes(y = cty, x = displ, color = cyl, shape = trans)) +
    geom_point()
```

The large `cty` value with `displ` equal to zero and `cyl` equal to `NA` is
worth a look:

```{r}
filter(newmpg2, displ == 0)
```

There are also vehicles with `displ` equal to `NA`:

```{r}
filter(newmpg2, is.na(displ))
```

These are dropped from the plot, though the range of their `cty`
values affects the default range shown in the plot.

By encoding these as `displ = 0` we can include these vehicles in the plot.

Specifying an explicit shape to use for `na.value` makes the points with
`NA` `trans` values visible.

```{r}
newmpg3 <- mutate(newmpg2, displ = ifelse(is.na(displ), 0, displ))

ggplot(newmpg3, aes(y = cty, x = displ, color = cyl, shape = trans)) +
    geom_point(na.rm = TRUE) +
    scale_shape_discrete(na.value = 21)
```
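The replacement step can be checked on a small vector. This sketch uses made-up values; `coalesce` from `dplyr` is shown as an alternative to the `ifelse` idiom, and the names `d1` and `d2` are illustrative.

```r
library(dplyr)

# Made-up displacement values with missing entries.
displ <- c(2.0, NA, 3.5, NA)

# Replace NA by 0, leaving the other values unchanged.
d1 <- ifelse(is.na(displ), 0, displ)

# coalesce() takes the first non-missing value elementwise,
# here either displ itself or the fallback 0.
d2 <- coalesce(displ, 0)

d1
all(d1 == d2)
```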

Considering only vehicles with positive and non-`NA` `displ` values
matches the `mpg` data:

```{r}
newmpg4 <- filter(newmpg3, displ > 0)
p <- ggplot(newmpg4,
            aes(y = cty, x = displ, color = cyl, shape = trans)) +
    geom_point(na.rm = TRUE) +
    scale_shape_discrete(na.value = 21)
p
```

The distribution of transmission type within cylinder levels:

```{r}
ggplot(newmpg1, aes(x = factor(cyl), fill = trans)) + geom_bar(position = "fill")
```

Fuel efficiency has improved, especially among vehicles with
smaller engines.

Manual transmissions are less common in this data set.

NA values are often important, and making them visible can be helpful.

What do fuel economy numbers for electric vehicles mean?
