Introduction to information system

R Graphics

lattice & ggplot2

Bowei Chen, Deema Hafeth and Jingmin Huang

School of Computer Science

University of Lincoln

CMP3036M/CMP9063M

Data Science 2016 – 2017 Workshop

Today’s Objectives

• Study the following slides:

– Part I: lattice

– Part II: ggplot2

• Do exercises 1-11

• Do additional exercises 1-3

(these will help you review and understand last week’s lecture)

lattice

lattice was developed and maintained by Deepayan Sarkar, Assistant Professor at the Indian Statistical Institute.

http://www.isid.ac.in/~deepayan/

Histogram by Wrap

> library(lattice)
> data(Chem97, package = "mlmRev")
> pl <- histogram(~ gcsescore |
+                   factor(score), data = Chem97)
> print(pl)

Density by Wrap

> pl <- densityplot(
+   ~ gcsescore | factor(score),
+   data = Chem97,
+   plot.points = FALSE,
+   ref = TRUE
+ )
> print(pl)

Density Plot by Different Colour

> pl <- densityplot(
+   ~ gcsescore,
+   data = Chem97,
+   groups = score,
+   plot.points = FALSE,
+   ref = TRUE,
+   auto.key = list(columns = 3)
+ )
> print(pl)
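The two density slides differ only in how the splitting variable is supplied: after `|` in the formula (one panel per level) or through `groups =` (one overlaid curve per level). The sketch below shows the same contrast on the built-in `iris` data, which is used here as a stand-in and is not part of the slides; the object names `p_panels` and `p_groups` are made up for illustration.

```r
library(lattice)

# Conditioning: the variable after `|` gives one panel per Species level.
p_panels <- densityplot(~ Sepal.Length | Species, data = iris,
                        plot.points = FALSE)

# Grouping: `groups =` overlays one curve per Species level in a single panel.
p_groups <- densityplot(~ Sepal.Length, data = iris, groups = Species,
                        plot.points = FALSE,
                        auto.key = list(columns = 3))

# lattice functions return "trellis" objects; print() draws them.
print(p_panels)
print(p_groups)
```

Conditioning tends to suit comparing shapes panel by panel, while grouping makes it easier to compare positions on a shared axis.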

Boxplot by Wrap (1/2)

> pl <- bwplot(
+   gcsescore ^ 2.34 ~ gender | factor(score),
+   Chem97,
+   varwidth = TRUE,
+   layout = c(6, 1),
+   ylab = "Transformed GCSE score"
+ )
> print(pl)

Boxplot by Wrap (2/2)

> pl <- bwplot(
+   score ~ gcsescore | factor(gender),
+   data = Chem97,
+   groups = score,
+   plot.points = FALSE,
+   ref = TRUE,
+   xlab = "Average GCSE score"
+ )
> print(pl)

Other Materials for lattice

The references below will be useful:

• http://www.isid.ac.in/~deepayan/R-tutorials/labs/04_lattice_lab.pdf

• https://www.stat.auckland.ac.nz/~paul/RGraphics/chapter4.pdf

• https://fas-web.sunderland.ac.uk/~cs0her/Statistics/UsingLatticeGraphicsInR.htm

ggplot2

ggplot2 was developed by Hadley Wickham, Chief Scientist at RStudio and an Adjunct Professor of Statistics at the University of Auckland.

http://hadley.nz/

Histogram by Wrap

> library(ggplot2)
> pg <-
+   ggplot(Chem97, aes(gcsescore)) +
+   geom_histogram(binwidth = 0.5) +
+   facet_wrap( ~ score)
> print(pg)

Density Plot by Wrap

> pg <- ggplot(Chem97, aes(gcsescore)) +
+   stat_density(geom = "path",
+                position = "identity") +
+   facet_wrap(~ score)
> print(pg)

Density Plot by Different Colour

> pg <- ggplot(Chem97, aes(gcsescore)) +
+   stat_density(geom = "path",
+                position = "identity",
+                aes(colour = factor(score)))
> print(pg)
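In ggplot2 the same choice between separate panels and overlaid curves is made by adding a facet or by mapping a variable to the `colour` aesthetic. A minimal sketch, assuming ggplot2 is installed and using the built-in `iris` data as a stand-in (the names `base_plot`, `pg_facets` and `pg_colour` are illustrative only):

```r
library(ggplot2)

# Shared starting point: data plus the x aesthetic.
base_plot <- ggplot(iris, aes(Sepal.Length))

# Faceting: one panel per Species.
pg_facets <- base_plot +
  geom_density() +
  facet_wrap(~ Species)

# Colour aesthetic: one panel, one coloured curve per Species.
pg_colour <- base_plot +
  geom_density(aes(colour = Species))

print(pg_facets)
print(pg_colour)
```

Because a plot is built by adding layers to an object, the common part can be stored once and reused for both variants.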

Boxplot by Wrap (1/2)

> pg <- ggplot(Chem97,
+              aes(factor(gender),
+                  gcsescore^2.34)) +
+   geom_boxplot() +
+   facet_grid(~score) +
+   ylab("Transformed GCSE score")
> print(pg)

Boxplot by Wrap (2/2)

> pg <- ggplot(Chem97,
+              aes(factor(score),
+                  gcsescore)) +
+   geom_boxplot() +
+   coord_flip() +
+   ylab("Average GCSE score") +
+   facet_wrap( ~ gender)
> print(pg)

Other Materials for ggplot2

The references below will be useful:

• http://www.ceb-institute.org/bbs/wp-content/uploads/2011/09/handout_ggplot2.pdf

• http://www.statmethods.net/advgraphs/ggplot2.html

• http://blog.echen.me/2012/01/17/quick-introduction-to-ggplot2/

• http://www.stat.wisc.edu/~larget/stat302/chap2.pdf

References

• W. Venables, D. Smith, and the R Core Team (2015) An Introduction to R.

• P. Teetor (2011) R Cookbook. O’Reilly.

• J. Adler (2012) R in a Nutshell, 2nd Edition. O’Reilly.

Exercises

Exercise 1/11

Use the lattice or ggplot2 package to draw the figure below

> library(lattice)
> data(postdoc, package = "latticeExtra")

> pl <- barchart(prop.table(postdoc, margin = 1),
+                xlab = "Proportion",
+                auto.key = list(adj = 1))
> print(pl)

Exercise 2/11

1) Read the dataset PublicHealthEnglandDataTableDistrict.xlsx

2) Plot the following figure using the lattice package

Hint: xyplot
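As a reminder of how `xyplot` reads its formula, here is a small sketch on the built-in `mtcars` data. The columns of the workshop spreadsheet are not shown on this slide, so nothing below refers to them; reading the file itself is left to you (one common option, assuming the readxl package is installed, is `readxl::read_excel()`).

```r
library(lattice)

# Stand-in data: mtcars ships with R. cyl is converted to a factor so that
# each cylinder count gets its own labelled panel.
mt <- transform(mtcars, cyl = factor(cyl))

# Formula y ~ x | g: mpg on the y axis, wt on the x axis, one panel per cyl.
pl_xy <- xyplot(mpg ~ wt | cyl, data = mt,
                layout = c(3, 1),
                xlab = "Weight (1000 lbs)",
                ylab = "Miles per gallon")
print(pl_xy)
```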

Exercise 3/11

1) Read the dataset PublicHealthEnglandDataTableDistrict.xlsx

2) Plot the following figure using the ggplot2 package

Hint: 1) ggplot; 2) plot points; 3) by wrap

Exercise 4/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 5/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 6/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 7/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 8/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 9/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 10/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the ggplot2 package

Exercise 11/11

1) Read the dataset “Chem97” from the “mlmRev” package

2) Plot the following figure using the lattice package

Additional Exercises

Well done if you’ve completed the exercises. Once you complete these additional exercises, you can leave the workshop session.

Additional Exercise (1/3)

Find the mean, median and mode of iris$Sepal.Length
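One thing to watch for: base R's `mode()` reports how an object is stored, not its most frequent value, so the statistical mode needs a small helper. The sketch below uses a made-up vector rather than the iris column, and `stat_mode` is a hypothetical name, not a base R function.

```r
# Hypothetical helper: return the most frequent value(s) of a vector.
stat_mode <- function(x) {
  counts <- table(x)                            # frequency of each distinct value
  modes <- names(counts)[counts == max(counts)] # keep ties as well
  # table() stores the values as character names; convert back for numbers
  if (is.numeric(x)) as.numeric(modes) else modes
}

x <- c(2, 3, 3, 5, 7, 7, 7, 9)  # made-up example data

mean(x)       # arithmetic mean
median(x)     # middle value of the sorted data
stat_mode(x)  # most frequent value
mode(x)       # "numeric": the storage mode, not the statistical mode
```

Returning every tied value, rather than only the first, is a design choice; a vector with two equally frequent values gives a result of length two.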

Additional Exercise (2/3)

1) Generate 100 random numbers which follow the uniform distribution 𝑈(0,1). What is the mean?

2) Repeat the above number-generating process 1000 times and record the mean values. Draw a histogram. What distribution does the histogram resemble?

3) Does it help you to understand the “Central Limit Theorem”?
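The "repeat and record" pattern in step 2 is usually written with `replicate()`. The sketch below applies it to a different, strongly skewed distribution (exponential with rate 1) so that it illustrates the technique without working the exercise; the sample size, repetition count, seed and the names `one_mean` and `sample_means` are all arbitrary choices made for this example.

```r
set.seed(1)  # arbitrary seed so the run is repeatable

# One experiment: the mean of n draws from an Exp(1) distribution.
one_mean <- function(n = 50) mean(rexp(n, rate = 1))

# Repeat the experiment 2000 times and keep every sample mean.
sample_means <- replicate(2000, one_mean())

# Exp(1) has mean 1 and sd 1, so the sample means should centre near 1
# with a spread of roughly 1 / sqrt(50).
mean(sample_means)
sd(sample_means)

hist(sample_means, breaks = 30,
     main = "Means of 50 Exp(1) draws", xlab = "Sample mean")
```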

Additional Exercise (3/3)

1) Generate 10000 random numbers which follow the normal distribution 𝒩(0,1). How many numbers are larger than 0.5?

2) What is the z-score value with 𝑛 = 1 and what is the corresponding probability from the table?

3) Compute the mean of 10 such random numbers, repeat this 10000 times, and record the mean values. How many of these mean values are larger than 0.5?

4) What is the z-score and the probability now?
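For a mean of n independent draws from a normal distribution with mean mu and standard deviation sigma, the z-score of a threshold is (threshold - mu) / (sigma / sqrt(n)), and `pnorm()` can stand in for the printed table. The sketch below uses a threshold of 1 and n = 4, values chosen for illustration and deliberately different from the exercise, and compares the table probability with a simulation.

```r
mu <- 0         # population mean
sigma <- 1      # population standard deviation
threshold <- 1  # illustrative threshold (not the one in the exercise)
n <- 4          # illustrative sample size (not the one in the exercise)

# z-score of the threshold for a mean of n draws.
z <- (threshold - mu) / (sigma / sqrt(n))

# Upper-tail probability P(Z > z), the value a z-table would give.
p_upper <- pnorm(z, lower.tail = FALSE)

# Simulation check: fraction of 10000 sample means above the threshold.
set.seed(2)  # arbitrary seed so the run is repeatable
means <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))
p_sim <- mean(means > threshold)

z
p_upper
p_sim
```

The simulated fraction should sit close to the table probability, and the agreement generally tightens as the number of repetitions grows.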

Thank You!

bchen@lincoln.ac.uk
