Week-1 Introduction to R

Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman

Business Forecasting Analytics
ADM 4307 – Fall 2021

Introduction to R and RStudio

Ahmet Kandakoglu, PhD

13 September, 2021

R and RStudio

R

• A software package for statistical computing and graphics

• Open source and therefore available free of charge.

• Interactive and thus suitable for data analysis.

• Very simple and intuitive

RStudio

• A powerful and productive user interface for R.

• Free and open source, and works great on Windows, macOS and Linux.

Fall 2021 ADM 4307 Business Forecasting Analytics 2

RStudio

• The console is where you can type commands and see output.

• The source is where you can type scripts and see the data tables.

• The workspace tab shows all the active objects (see next slide).

• The history tab shows a list of commands used so far.

• The files tab shows all the files and folders in your default workspace.

• The plots tab will show all your graphs.

• The packages tab will list a series of packages or add-ons needed to run certain processes.

• For additional info see the help tab.

Installing R and RStudio

• Download and install R

• Download and install RStudio Desktop

• Two ways to install required packages

• Run RStudio. On the “Tools” menu, click on “Install Packages” and install the packages “fpp3” and “tidyverse” (make sure “Install Dependencies” is checked)

• Run RStudio. Run the code install.packages(c("tidyverse", "fpp3")) using the Console

• That’s it! You should now be ready to go.
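The Console route above can be sketched as a small script that installs only what is missing. This is a minimal sketch, not part of the original slides: the variable names `needed` and `to_install` are illustrative, and the `interactive()` guard is an added assumption so the install only runs when typed into a live session such as the RStudio Console.

```r
# Packages named on the slide
needed <- c("tidyverse", "fpp3")

# Keep only the packages that are not yet in the local library
to_install <- setdiff(needed, rownames(installed.packages()))

# Install from CRAN; skipped when the script is run non-interactively
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

Running it a second time should leave `to_install` empty, so nothing is downloaded again.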

Download links:
R: https://cran.r-project.org/
RStudio Desktop: https://www.rstudio.com/products/rstudio/download/

Main packages

• Data manipulation and plotting functions

• library(tidyverse)

• Time series manipulation

• library(tsibble)

• Tidy time series data

• library(tsibbledata)

• Time series graphics and statistics

• library(feasts)

• Forecasting functions

• library(fable)

• All of the above

• library(fpp3)

Getting Started with R

• Load the fpp3 package using the Packages tab. This can also be done by typing library(fpp3) in the Console panel or in your script.

• Using the File menu, choose “Import Dataset” and import the data from the tute1.csv file

• The data is now saved as an object in your Global Environment workspace. Clicking the name of the object will cause it to be viewed. Typing the name of the object in the Console tab will cause it to be printed to the console.
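As a console alternative to the menu import, the steps above can be sketched with read.csv. Because tute1.csv is not reproduced here, the sketch writes a tiny stand-in file first. The column names (Quarter, Sales, AdBudget, GDP), the values, and the names `toy`, `csv_path` and `toy_data` are all made up for illustration and are not the real data.

```r
# Write a tiny stand-in CSV to a temporary file (all values are made up)
toy <- data.frame(
  Quarter  = c("Mar-81", "Jun-81", "Sep-81", "Dec-81"),
  Sales    = c(10, 12, 9, 11),
  AdBudget = c(5, 6, 4, 5),
  GDP      = c(2.0, 2.1, 1.9, 2.2)
)
csv_path <- file.path(tempdir(), "toy_tute1.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the file into an object in the Global Environment
toy_data <- read.csv(csv_path)

toy_data         # typing the name prints the object to the console
str(toy_data)    # compact listing of column names and types
head(toy_data, 2)  # first two rows only
```

With the real file, the same call would be read.csv("tute1.csv") after setting the working directory to the folder that contains it.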

Use R

See what the following commands do:

tute1[,2]

tute1[,"Sales"]

tute1[5,]

tute1[1:10,3:4]

tute1[1:10,2] <- 0

tute1[1:20,]

Notice that “<-” means to assign the value on the right to the object on the left

Convert the Data

Convert the data to time series:

# First, set the working directory from the menu “Session/Set Working Directory”
# Then, execute the following
tute1 <- read.csv("tute1.csv")

# Create the time series object
tute1 <- ts(tute1[,-1], start=1981, frequency=4)
tute1_tsibble <- as_tsibble(tute1)

The "[,-1]" removes the first column which contains the quarters as we don’t need them now

Time Series Plots

Construct time series plots of each of the three series:

plot(tute1)

Time Series Plots

• Try the following plots and figure out what is being plotted in each case:

tute1_tsibble %>% filter(key == "Sales") %>% gg_season(value)

tute1_tsibble %>% filter(key == "AdBudget") %>% gg_season(value)

tute1_tsibble %>% filter(key == "GDP") %>% gg_season(value)

• The notation data[,"x"] means: column named x in the data set named data

• What features do you notice about each of the series AdBudget, Sales and GDP?
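The indexing notation and the conversion to a time series can be tried on a small stand-in data set. This is a sketch under stated assumptions: the values and the names `toy`, `toy_ts` and `toy_tsibble` are invented for illustration, and the column layout (quarter labels first, then Sales, AdBudget, GDP) is assumed rather than taken from the real tute1.csv. The last step runs only if the tsibble package is installed.

```r
# Stand-in for tute1.csv: two years of quarterly data (all values are made up)
toy <- data.frame(
  Quarter  = c("Mar-81", "Jun-81", "Sep-81", "Dec-81",
               "Mar-82", "Jun-82", "Sep-82", "Dec-82"),
  Sales    = c(10, 12, 9, 11, 11, 13, 10, 12),
  AdBudget = c(5, 6, 4, 5, 5, 7, 4, 6),
  GDP      = c(2.0, 2.1, 1.9, 2.2, 2.1, 2.2, 2.0, 2.3)
)

toy[, "Sales"]      # column selected by name
names(toy[, -1])    # a negative index drops column 1 (the quarter labels)

# Quarterly series starting in the first quarter of 1981
toy_ts <- ts(toy[, -1], start = 1981, frequency = 4)
toy_ts[, "Sales"]   # name-based selection also works on the ts matrix
time(toy_ts)        # time stamps advance by 0.25 per quarter

# Long format that the filter(key == ...) calls rely on
if (requireNamespace("tsibble", quietly = TRUE)) {
  toy_tsibble <- tsibble::as_tsibble(toy_ts)
  print(names(toy_tsibble))   # index, key, value
}
```

The long format explains the filter step in the seasonal plots: each original column name becomes a value of key, so filter(key == "Sales") picks out one series before gg_season(value) plots it.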

Scatterplots

• Construct scatterplots of (AdBudget,Sales) and (GDP,Sales), with Sales on the vertical axes:

plot(Sales ~ AdBudget, data=tute1)

plot(Sales ~ GDP, data=tute1)

Scatterplot Matrix

• Do a scatterplot matrix of the three variables:

pairs(as.data.frame(tute1))

Summary Information

• Use the summary command to get summary information about the data:

summary(tute1)

Histogram

• Produce some more plots of the data:

hist(tute1[,"Sales"])

Histogram

• Produce some more plots of the data:

hist(tute1[,"AdBudget"])

Boxplot

boxplot(tute1[,"Sales"])

Boxplot

boxplot(as.data.frame(tute1))

or

boxplot(tute1)

Correlation Test

• Also do a correlation test:

cor.test(tute1[,"Sales"],tute1[,"AdBudget"])
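To see what a correlation test returns, it may help to store the result and inspect its parts. This is a minimal sketch with invented numbers: the vectors `sales` and `ad_budget` are made-up stand-ins, not the tute1 series, so the printed values say nothing about the real data.

```r
# Made-up paired observations standing in for Sales and AdBudget
sales     <- c(10, 12, 9, 11, 11, 13, 10, 12)
ad_budget <- c(5, 6, 4, 5, 5, 7, 4, 6)

# Pearson correlation test is the default method
result <- cor.test(sales, ad_budget)

result$estimate   # sample correlation coefficient
result$p.value    # p-value for the null hypothesis of zero correlation
result$conf.int   # 95% confidence interval for the correlation
```

Printing `result` on its own shows the same information in the formatted report that the slide command produces.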

RStudio as a Calculator

• Now try using RStudio as a calculator. Figure out what each of the following is doing:

(100+2)/3

5*10^2

1/0

0/0

(0i-9)^(1/2)

sqrt(2 * max(-10, 0.2, 4.5))

x <- sqrt(2 * max(-10, 0.2, 4.5)) + 100

x

log(100)

log(100, base=10)

• Save the workfile, and exit RStudio

More Tutorials

• There are dozens of R tutorials available on the web. Some of the best of them are listed below:

• Try R Code School: http://tryr.codeschool.com/

• R tutorial (Clarkson University): http://www.cyclismo.org/tutorial/R

• DataCamp Introduction to R: https://www.datacamp.com/courses/free-introduction-to-r

• Coursera R Programming: https://www.coursera.org/learn/r-programming

• R for Data Science by Garrett Grolemund and Hadley Wickham: https://r4ds.had.co.nz/