CS570 Biomedical Science & Health IT

CS544 D1
Foundations of Analytics
Lecture 6

Guanglan Zhang

Bivariate Data – Un-summarized Data

If the data for two categorical variables is given in unsummarized form (the actual values), a contingency table can be created by counting how often each combination of values occurs.
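As a minimal sketch of this step, the example below builds a contingency table with table(). The survey data frame and its values are invented for illustration and are not from the lecture.

```r
# Hypothetical unsummarized data: one row per person (values invented).
survey <- data.frame(
  smoker   = c("yes", "no", "no", "yes", "no", "no", "yes", "no"),
  exercise = c("low", "high", "low", "low", "high", "high", "high", "low")
)

# table() counts every combination of the two categorical variables.
tab <- table(survey$smoker, survey$exercise, dnn = c("smoker", "exercise"))
print(tab)
```

Rows correspond to the first variable passed to table() and columns to the second.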

Bivariate Data – Summarized Data

Sometimes data comparing two variables is available only in summarized form, as a two-way table of counts.
The distribution of each variable separately is the marginal distribution of that variable.
In a two-way table, adding the rows or the columns gives the marginal distribution of the corresponding variable.
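The marginal distributions can be sketched as follows, assuming a small two-way table of counts. The counts and the variable names are invented for illustration.

```r
# Hypothetical summarized counts (invented for illustration).
counts <- as.table(matrix(c(3, 2, 1, 2), nrow = 2, byrow = TRUE,
                          dimnames = list(smoker = c("no", "yes"),
                                          exercise = c("high", "low"))))

# Summing across each row gives the marginal distribution of the row variable.
row_marginal <- margin.table(counts, 1)

# Summing down each column gives the marginal distribution of the column variable.
col_marginal <- margin.table(counts, 2)

# addmargins() appends both sets of totals to the table itself.
with_totals <- addmargins(counts)
print(with_totals)
```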

Graphical Summarization of Two-way Tables
The mosaic plot is a graphical display showing the relationship among two or more categorical variables.
A bar plot can also be used to present two-way data graphically.
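Both displays can be drawn from a two-way table with base R graphics. The sketch below uses invented counts; the titles and labels are placeholders.

```r
# Hypothetical two-way table of counts (invented for illustration).
counts <- as.table(matrix(c(3, 2, 1, 2), nrow = 2, byrow = TRUE,
                          dimnames = list(smoker = c("no", "yes"),
                                          exercise = c("high", "low"))))

# Mosaic plot: tile areas are proportional to the cell counts.
mosaicplot(counts, main = "Smoking by exercise level", color = TRUE)

# Grouped bar plot: one cluster of bars per column of the table.
mids <- barplot(counts, beside = TRUE, legend.text = TRUE,
                main = "Smoking by exercise level",
                xlab = "exercise", ylab = "count")
```

With beside = TRUE the bars are placed side by side; leaving it at the default FALSE stacks them instead.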


Bivariate Data – Relationships in Numeric Data

Graphical Representation
A scatterplot is used to visualize the relationship between two numerical variables.
Use the plot() function to draw a scatterplot:

> plot(data$explanatoryvariable, data$responsevariable)

Use main, xlab, and ylab to label the plot appropriately.
Use xlim and ylim to control the ranges of the x and y axes.
Change the point symbol using pch and/or the point color using col.
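A short sketch that combines these arguments is shown below. It assumes R's built-in cars data set (speed and stopping distance), which is not part of the lecture; the axis limits and styling values are arbitrary choices.

```r
# cars ships with base R: speed (mph) and stopping distance (ft).
plot(cars$speed, cars$dist,
     main = "Stopping distance vs. speed",  # plot title
     xlab = "Speed (mph)",                  # x-axis label
     ylab = "Stopping distance (ft)",       # y-axis label
     xlim = c(0, 30),                       # x-axis range
     ylim = c(0, 130),                      # y-axis range
     pch  = 19,                             # solid circle symbol
     col  = "blue")                         # point color
```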


Multivariate Data

Three-way (or n-way) contingency tables show the relationships among three (or n) variables using multiple two-way tables.
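As an illustration, passing three variables to table() yields a three-way table that prints as a series of two-way tables. The sketch assumes R's built-in mtcars data set, which is not part of the lecture.

```r
# Three categorical-like variables from mtcars: cylinders, gears, transmission.
tab3 <- table(cyl = mtcars$cyl, gear = mtcars$gear, am = mtcars$am)

# Prints one cyl-by-gear two-way table for each value of am.
print(tab3)

# ftable() flattens the same counts into a single compact display.
flat <- ftable(tab3)
print(flat)
```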

Graphical Summarization
The box plot can be used to compare independent samples graphically.
The scatterplot matrix shows the pair-wise relationships between the given variables using a scatter plot for each pair.
The bar plot and the mosaic plot can be used to present summarized data graphically.
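The box plot and the scatterplot matrix can be sketched with base R as follows. The example assumes the built-in iris data set, which is not part of the lecture.

```r
# Side-by-side box plots: one box per independent sample (here, per species).
bp <- boxplot(Sepal.Length ~ Species, data = iris,
              main = "Sepal length by species",
              xlab = "Species", ylab = "Sepal length (cm)")

# Scatterplot matrix: one scatterplot for each pair of numeric variables.
pairs(iris[, 1:4], main = "Pairwise relationships in iris")
```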


Handling null values

R supports: NULL, NA, NaN, Inf/-Inf

NULL – a reserved word representing the null object. It is often returned by expressions and functions whose value is undefined.

NA – a logical constant of length 1 indicating a missing value.

NaN – stands for Not a Number. It is the result of an undefined numeric operation such as 0/0.

Inf / -Inf – stands for positive or negative infinity. It results from a number too large to store or from dividing a nonzero number by zero.
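The behavior of these four values can be checked with a few base R calls. This is a minimal sketch; the vector x and the other variable names are invented for illustration.

```r
# NULL is an empty object with length 0.
null_len <- length(NULL)

# NA marks a missing value and propagates through computations.
x <- c(4, NA, 7)
missing_flags <- is.na(x)                 # TRUE where a value is missing
mean_with_na  <- mean(x)                  # NA, because one value is missing
mean_clean    <- mean(x, na.rm = TRUE)    # drops NA before averaging

# NaN arises from undefined numeric operations.
not_a_number <- 0 / 0
nan_is_na    <- is.na(not_a_number)       # is.na() is also TRUE for NaN

# Inf and -Inf arise from overflow or from dividing a nonzero number by zero.
pos_inf  <- 1 / 0
neg_inf  <- -1 / 0
overflow <- 2^1024                        # too large for a double

# is.finite() is FALSE for Inf, -Inf, NA, and NaN.
finite_flags <- is.finite(c(1, Inf, NA, NaN))
```

Passing na.rm = TRUE is a common way to handle NA in summary functions such as mean(), sum(), and max().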


In-class quizzes

Go to https://b.socrative.com

Enter classroom: ZHANG6334

