Week-1 Introduction to R
Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman
Business Forecasting Analytics
ADM 4307 – Fall 2021
Introduction to R and RStudio
Ahmet Kandakoglu, PhD
13 September, 2021
R and RStudio
R
• A software package for statistical computing and graphics
• Open source and therefore available free of charge.
• Interactive and thus suitable for data analysis.
• Very simple and intuitive
RStudio
• A powerful and productive user interface for R.
• Free and open source, and works great on Windows, macOS and Linux.
Fall 2021 ADM 4307 Business Forecasting Analytics 2
RStudio
• The console is where you can type commands and see output.
• The source is where you can type scripts and see the data tables.
• The workspace tab shows all the active objects (see next slide).
• The history tab shows a list of commands used so far.
• The files tab shows all the files and folders in your default workspace.
• The plots tab will show all your graphs.
• The packages tab will list a series of packages or add-ons needed to run certain processes.
• For additional info see the help tab.
Installing R and RStudio
• Download and install R: https://cran.r-project.org/
• Download and install RStudio Desktop: https://www.rstudio.com/products/rstudio/download/
• Two ways to install required packages
  • Run RStudio. On the “Tools” menu, click on “Install Packages” and install the packages “fpp3” and “tidyverse” (make sure “Install Dependencies” is checked)
  • Run RStudio. Run the code install.packages(c("tidyverse", "fpp3")) using the Console
• That’s it! You should now be ready to go.
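The Console route can also be written so that it installs only what is missing. This is a minimal sketch, not part of the original slides: the names needed and missing_packages() are illustrative, and the install step is guarded by interactive() so that nothing is downloaded when the script runs unattended.

```r
# Sketch: find which required packages are not yet installed.
# `needed` and `missing_packages()` are illustrative names (assumptions).
needed <- c("tidyverse", "fpp3")

missing_packages <- function(pkgs) {
  # installed.packages() returns a matrix whose row names are package names
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(needed)

# Only install when a person is at the console, and only if something is missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```

Running it a second time should leave to_install empty, which is a quick way to confirm the installation worked.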
Main packages
• Data manipulation and plotting functions: library(tidyverse)
• Time series manipulation: library(tsibble)
• Tidy time series data: library(tsibbledata)
• Time series graphics and statistics: library(feasts)
• Forecasting functions: library(fable)
• All of the above: library(fpp3)
Getting Started with R
• Load the fpp3 package using the Packages tab. This can also be done by
typing library(fpp3) in the Console panel or in your script.
• Using the File menu, choose “Import Dataset” and import the data from the
tute1.csv file
• The data is now saved as an object in your Global Environment workspace.
Clicking the name of the object will cause it to be viewed. Typing the name of
the object in the Console tab will cause it to be printed to the console.
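The menu import can also be done from a script. The sketch below is an illustration only: it writes a small toy file to a temporary folder so that it runs without tute1.csv, the values are made up, and the first column name (Quarter) is an assumption about the file layout.

```r
# Toy stand-in for tute1.csv: the values below are made up for illustration.
# The column name "Quarter" is an assumption; Sales, AdBudget and GDP are the
# series names used in this deck.
csv_path <- file.path(tempdir(), "tute1_toy.csv")
toy <- data.frame(
  Quarter  = c("1981-03-01", "1981-06-01", "1981-09-01", "1981-12-01"),
  Sales    = c(1000, 1050, 980, 1100),
  AdBudget = c(500, 520, 490, 560),
  GDP      = c(270, 272, 275, 273)
)
write.csv(toy, csv_path, row.names = FALSE)

# Script equivalent of importing the file through the menu
tute1_toy <- read.csv(csv_path)

str(tute1_toy)    # column names and types
head(tute1_toy)   # first rows of the data
```

With the real file, replacing csv_path by "tute1.csv" (after setting the working directory) should give the same kind of object.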
Use R
See what the following commands do:
tute1[,2]
tute1[,"Sales"]
tute1[5,]
tute1[1:10,3:4]
tute1[1:10,2] <- 0
tute1[1:20,]
Notice that “<-” means to assign the value on the right to the object on the left
Convert the Data
Convert the data to time series:
# First, set the working directory from the menu “Session/Set Working Directory”
# Then, execute the following
tute1 <- read.csv("tute1.csv")
# Create the time series object
tute1 <- ts(tute1[,-1], start=1981, frequency=4)
tute1_tsibble <- as_tsibble(tute1)
The "[,-1]" removes the first column, which contains the quarters, as we don’t need them now
Time Series Plots
Construct time series plots of each of the three series:
plot(tute1)
Time Series Plots
• Try the following plots and figure out what is being plotted in each case:
tute1_tsibble %>% filter(key == "Sales") %>% gg_season(value)
tute1_tsibble %>% filter(key == "AdBudget") %>% gg_season(value)
tute1_tsibble %>% filter(key == "GDP") %>% gg_season(value)
• The notation data[,"x"] means: the column named x in the data set named data
• What features do you notice about each of the series AdBudget, Sales and GDP?
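To see what the column notation and the quarterly time series structure give you, here is a small base-R sketch. It uses made-up toy values rather than the tute1 data, and it deliberately avoids fpp3 so that it runs on a plain R installation; the quarter averages are only a numeric hint at the within-year pattern that a seasonal plot shows visually.

```r
# Toy quarterly data (made-up values) standing in for the three series.
toy <- data.frame(
  Quarter  = 1:8,
  Sales    = c(100, 120, 90, 130, 105, 125, 95, 135),
  AdBudget = c(50, 60, 45, 65, 52, 62, 47, 67),
  GDP      = c(270, 271, 272, 273, 274, 275, 276, 277)
)

# Drop the first column and declare quarterly data starting in 1981 Q1
toy_ts <- ts(toy[, -1], start = 1981, frequency = 4)

sales <- toy_ts[, "Sales"]   # the column named "Sales", still a quarterly series
cycle(sales)                 # quarter number (1 to 4) of each observation

# Average of each quarter across years
quarter_means <- tapply(sales, cycle(sales), mean)
quarter_means
```

If one quarter's average stands clearly above or below the others, that series probably has a seasonal pattern worth looking for in the plots.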
Scatterplots
• Construct scatterplots of (AdBudget,Sales) and (GDP,Sales), with Sales on
the vertical axes:
plot(Sales ~ AdBudget, data=tute1)
plot(Sales ~ GDP, data=tute1)
Scatterplot Matrix
• Do a scatterplot matrix of the three variables:
pairs(as.data.frame(tute1))
Summary Information
• Use the summary command to get summary information about the data:
summary(tute1)
Histogram
• Produce some more plots of the data:
hist(tute1[,"Sales"])
Histogram
• Produce some more plots of the data:
hist(tute1[,"AdBudget"])
Boxplot
boxplot(tute1[,"Sales"])
Boxplot
boxplot(as.data.frame(tute1))
or
boxplot(tute1)
Correlation Test
• Also do a correlation test:
cor.test(tute1[,"Sales"],tute1[,"AdBudget"])
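The object returned by cor.test() can be stored and its parts inspected one at a time. The sketch below uses simulated numbers, not the course data; the variable names and the relationship between them are assumptions made only so the example runs on its own.

```r
# Simulated data for illustration only (not the tute1 series)
set.seed(1)
ad_budget <- rnorm(40, mean = 500, sd = 50)
sales <- 2 * ad_budget + rnorm(40, sd = 40)

result <- cor.test(sales, ad_budget)

result$estimate   # sample Pearson correlation
result$p.value    # p-value for the null hypothesis of zero correlation
result$conf.int   # 95% confidence interval for the correlation
```

A small p-value together with a confidence interval that excludes zero suggests the two series move together, though it says nothing about which one drives the other.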
RStudio as a Calculator
• Now try using RStudio as a calculator. Figure out what each of the following is
doing:
(100+2)/3
5*10^2
1/0
0/0
(0i-9)^(1/2)
sqrt(2 * max(-10, 0.2, 4.5))
x <- sqrt(2 * max(-10, 0.2, 4.5)) + 100
x
log(100)
log(100, base=10)
• Save the workfile, and exit RStudio
More Tutorials
• There are dozens of R tutorials available on the web. Some of the best of them are listed below:
• Try R Code School: http://tryr.codeschool.com/
• R tutorial (Clarkson University): http://www.cyclismo.org/tutorial/R
• DataCamp Introduction to R: https://www.datacamp.com/courses/free-introduction-to-r
• Coursera R Programming: https://www.coursera.org/learn/r-programming
• R for Data Science by Garrett Grolemund and Hadley Wickham: https://r4ds.had.co.nz/