Review of VBA and R
2020/04/15
VBA Makeup Quiz
VBA codes
Calculate the portfolio:
Clear the data:
Overview for R Language
• Basic data types
• Operators in R Programming
• Control Structures
• Functions for reading data
• Loop Functions
• Examples
Basic data types
R Programming works with numerous data types, including
• Scalars
• Vectors (numerical, character, logical)
• Matrices
• Lists
• Data frames
Vectors
• The Vector is the most basic data structure in R programming. An R Vector holds a collection of elements of the same type (integer, double, character, logical, etc.).
Create R Vector using Range
Create R Vector using Sequence (seq) Operator
Creating R Vector using concatenation c
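The three creation methods listed above can be sketched as follows. The variable names are illustrative and not taken from the original slides.

```r
# Range operator: consecutive integers from 1 to 5
a <- 1:5

# seq(): from 1 to 10 in steps of 2
b <- seq(from = 1, to = 10, by = 2)

# c(): concatenate individual values into one vector
d <- c("apple", "banana", "cherry")

print(a)
print(b)
print(d)
```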
Vectors
• Access R Vector Elements
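A minimal sketch of common ways to access vector elements, assuming a small illustrative vector `v`. Note that R indexes from 1, not 0.

```r
v <- c(10, 20, 30, 40, 50)

first  <- v[1]         # single element by position
some   <- v[c(2, 4)]   # several positions at once
rest   <- v[-1]        # negative index drops that position
bigger <- v[v > 25]    # logical condition selects matching elements

print(first)
print(some)
print(rest)
print(bigger)
```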
Vectors
• In R Programming, we can manipulate the Vector elements in the following ways:
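The original slide's code is not preserved, so the following is only a sketch of typical manipulations (replace, append, drop, conditional replace) on an illustrative vector.

```r
v <- c(10, 20, 30)

v[2] <- 99         # replace the second element
v <- c(v, 40)      # append a value at the end
v <- v[-1]         # drop the first element
v[v > 50] <- 0     # replace every element greater than 50

print(v)
```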
Vectors
• Important Functions of Vector in R
typeof(Vector): This function tells you the data type of the vector.
sort(Vector): This function sorts the items in ascending order.
length(Vector): This function counts the number of elements in a vector.
head(Vector, limit): This function returns the first six elements if you omit the limit. If you specify the limit as 4, it returns the first 4 elements.
tail(Vector, limit): This function returns the last six elements if you omit the limit. If you specify the limit as 2, it returns the last two elements.
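A short sketch of the functions listed above, applied to an illustrative numeric vector.

```r
v <- c(7, 3, 9, 1, 5, 8, 2)

v_type   <- typeof(v)    # storage type of the vector
v_sorted <- sort(v)      # ascending order
v_len    <- length(v)    # number of elements
top6     <- head(v)      # first six elements (default limit)
top4     <- head(v, 4)   # first four elements
last2    <- tail(v, 2)   # last two elements

print(v_type)
print(v_sorted)
print(last2)
```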
Vectors
• Arithmetic Operations on R Vector
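Arithmetic on vectors is applied element by element. A minimal sketch with two illustrative vectors of equal length:

```r
x <- c(1, 2, 3)
y <- c(10, 20, 30)

add_xy  <- x + y   # element-wise addition
sub_yx  <- y - x   # element-wise subtraction
mul_xy  <- x * y   # element-wise multiplication
div_yx  <- y / x   # element-wise division
doubled <- x * 2   # the scalar 2 is recycled across all elements

print(add_xy)
print(doubled)
```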
Matrices
• The Matrix in R is a two-dimensional data structure. In an R Matrix, data is stored in rows and columns, and we can access a matrix element using both the row index and the column index (like a cell in an Excel sheet).
In the syntax matrix(data, nrow, ncol, byrow, dimnames), data is a Vector and
• nrow: The number of Rows you want to create. For example, nrow = 3 will create a Matrix of 3 Rows
• ncol: The number of Columns you want to create. For example, ncol = 2 will create a Matrix of 2 Columns
• byrow: It is FALSE by default, but you can change it as required. If it is TRUE, the Matrix elements are filled row by row
• dimnames: It is used to change the default Row and Column names to more meaningful names.
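A minimal sketch of the arguments described above. The data and the row and column names are illustrative.

```r
# Fill row by row and give rows and columns meaningful names
m <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), c("A", "B")))
print(m)

# Default byrow = FALSE fills column by column
m_col <- matrix(1:6, nrow = 3, ncol = 2)
print(m_col)
```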
Matrices
• Create Matrix in R
Matrices
• Simple approach to create matrix in R
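The slide's own code is not preserved. One common simple approach, assumed here, is to bind existing vectors together with `cbind()` (as columns) or `rbind()` (as rows).

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

by_cols <- cbind(a, b)   # vectors become columns: 3 x 2
by_rows <- rbind(a, b)   # vectors become rows: 2 x 3

print(by_cols)
print(by_rows)
```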
Matrices
• Accessing R Matrix Elements
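A sketch of element, row, and column access using `[row, column]` indexing on an illustrative matrix:

```r
m <- matrix(1:12, nrow = 3, ncol = 4)   # filled column by column

cell <- m[2, 3]   # row 2, column 3
row1 <- m[1, ]    # entire first row
col4 <- m[, 4]    # entire fourth column

print(cell)
print(row1)
print(col4)
```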
Matrices
• Accessing Subset of a Matrix in R
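A subset is selected by passing vectors of row and column indices. The sketch below uses an illustrative matrix; `drop = FALSE` is shown because a single-column selection otherwise collapses to a plain vector.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)

sub     <- m[1:2, c(2, 4)]        # rows 1-2, columns 2 and 4
one_col <- m[, 2, drop = FALSE]   # keep the result as a 3 x 1 matrix

print(sub)
print(one_col)
```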
Matrices
• Modify R Matrix Elements
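A sketch of three ways to modify a matrix: by position, by condition, and by replacing a whole row. The values are illustrative.

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3, filled column by column

m[1, 2] <- 100L              # change a single cell
m[m > 4] <- 0L               # change every cell greater than 4
m[2, ] <- c(7L, 8L, 9L)      # replace the whole second row

print(m)
```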
Matrices
• R Matrix Addition and Subtraction
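Addition and subtraction work element by element on matrices of the same dimensions. A minimal sketch with illustrative values:

```r
a <- matrix(c(1, 2, 3, 4), nrow = 2)
b <- matrix(c(10, 20, 30, 40), nrow = 2)

total <- a + b   # element-wise sum
diff  <- b - a   # element-wise difference

print(total)
print(diff)
```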
Lists
• The R List is one of the most powerful and useful data structures in practice. Lists allow us to store elements of different types, such as integers, strings, Vectors, matrices, other lists (nested Lists), Data Frames, etc. Because of this, a list is often described as an advanced vector.
• Create R List
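A minimal sketch of creating a list with `list()`, first unnamed and then with named elements. All values are illustrative.

```r
# Elements of different types in one list
lst <- list(42L, "text", TRUE, 3.14)

# Named elements are easier to access later
named <- list(id = 1L, name = "Ann", scores = c(90, 85))

str(named)
```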
Lists
• Creating R List using Vectors
Lists
• Creating R List using Matrix, Vectors
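A sketch of a list built from existing vectors, a matrix, and a nested list. The element names are illustrative.

```r
v_num <- c(1, 2, 3)
v_chr <- c("a", "b")
m     <- matrix(1:4, nrow = 2)

mixed <- list(numbers = v_num, chars = v_chr, grid = m,
              inner = list(1, "x"))   # nested list

str(mixed)
```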
Lists
• Accessing R List Elements
In R programming, we can access the elements in a List using the index position. Using this index value, we can access or alter/change each and every individual element present in the List.
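A minimal sketch of accessing and altering list elements, assuming an illustrative list `person`. Double brackets return the element itself, while single brackets return a sub-list.

```r
person <- list(name = "Ann", scores = c(90, 85, 70))

by_pos  <- person[[1]]          # element by index position
by_name <- person[["scores"]]   # element by name
second  <- person$scores[2]     # one value inside an element
as_list <- person["name"]       # single brackets keep the list wrapper

person$scores[2] <- 88          # alter an existing value
person$city <- "Paris"          # add a new element

str(person)
```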
Data frames
The Data Frame in R is a table or two-dimensional data structure. In R Data Frames, data is stored in row and columns, and we can access the data frame elements using the row index and column index. The following are some of the characteristics of the R Data Frame:
• A data frame is a list of variables (columns) that must all contain the same number of rows, with unique row names.
• The Column Names should not be empty.
• Although an R data frame supports duplicate column names when check.names = FALSE is used, it is always preferable to use unique Column names.
• The data stored in a data frame can be of Character type, Numerical type, or Factors.
Data frames
• Create Data Frame in R
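A minimal sketch of creating a data frame with `data.frame()`. The `employees` table and its columns are illustrative, not taken from the slides.

```r
employees <- data.frame(
  id     = 1:3,
  name   = c("Ann", "Bob", "Cid"),
  salary = c(5000, 4200, 6100),
  stringsAsFactors = FALSE   # keep text columns as character
)

str(employees)
```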
Data frames
• Access Elements of a Data Frame
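A sketch of row and column indexing on an illustrative data frame:

```r
employees <- data.frame(
  id = 1:3, name = c("Ann", "Bob", "Cid"),
  salary = c(5000, 4200, 6100), stringsAsFactors = FALSE
)

cell      <- employees[2, 3]          # row 2, column 3
first_row <- employees[1, ]           # whole row (still a data frame)
names_col <- employees[, "name"]      # column by name
sal_col   <- employees[["salary"]]    # column as a plain vector
high_paid <- employees[employees$salary > 4500, ]   # filter rows

print(high_paid)
```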
Data frames
• Accessing R Data Frame items using $
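The `$` operator extracts a column by name as a vector. A minimal sketch on an illustrative data frame:

```r
employees <- data.frame(
  name = c("Ann", "Bob", "Cid"),
  salary = c(5000, 4200, 6100), stringsAsFactors = FALSE
)

all_names  <- employees$name        # whole column
bob_salary <- employees$salary[2]   # one value inside the column
avg_salary <- mean(employees$salary)

print(avg_salary)
```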
Data frames
• Accessing Low Level elements of R Data Frame
Data frames
• Modifying R Data Frame Elements
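A sketch of modifying a single cell, a value inside a column, and a whole column. The data is illustrative.

```r
employees <- data.frame(
  name = c("Ann", "Bob", "Cid"),
  salary = c(5000, 4200, 6100), stringsAsFactors = FALSE
)

employees[2, "salary"] <- 4500      # change one cell
employees$name[3] <- "Cyd"          # change one value via $
employees$salary <- employees$salary * 1.1   # change a whole column

print(employees)
```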
Data frames
Add Elements to R Data Frame
• cbind(Data Frame, Values): This function adds extra Columns with values. In general, a Vector is passed as the values parameter.
• rbind(Data Frame, Values): This function adds extra Rows with values.
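A minimal sketch of both functions. The new row is passed as a one-row data frame here (an assumption made so that column types are preserved).

```r
employees <- data.frame(id = 1:2, name = c("Ann", "Bob"),
                        stringsAsFactors = FALSE)

# Add a column from a vector
employees <- cbind(employees, salary = c(5000, 4200))

# Add a row with matching column names
new_row <- data.frame(id = 3L, name = "Cid", salary = 6100,
                      stringsAsFactors = FALSE)
employees <- rbind(employees, new_row)

print(employees)
```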
Operators
Operator      Description
<, <=         Less than, less than or equal to
>, >=         Greater than, greater than or equal to
==            Exactly equal to
!=            Not equal to
^ or **       Exponentiation
!X            Not X
X & Y         X AND Y
X | Y         X OR Y
isTRUE(x)     Test if X is TRUE
%in%          x %in% c(2,4): returns TRUE if x belongs to c(2,4)

Operator      Description
+             Addition
-             Subtraction
*             Multiplication
/             Division
^ or **       Exponentiation
x %% y        Modulus (x mod y): 5 %% 2 is 1
x %/% y       Integer division: 5 %/% 2 is 2
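The operators in the tables above can be tried out as follows. This is a small sketch with an illustrative value `x`.

```r
x <- 5

both    <- x > 3 & x < 10    # AND: both conditions hold
either  <- x < 3 | x == 5    # OR: the second condition holds
negated <- !(x != 5)         # NOT of "x is not 5"
single  <- isTRUE(x == 5)    # TRUE only for a single TRUE value
member  <- x %in% c(2, 4)    # 5 is not in c(2, 4)

remainder <- 5 %% 2          # modulus
quotient  <- 5 %/% 2         # integer division
power     <- 2 ^ 3           # exponentiation

print(c(remainder, quotient, power))
```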
R If Else Statement
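A minimal sketch of an if / else if / else chain. The function `classify` and its thresholds are hypothetical and serve only to illustrate the syntax.

```r
classify <- function(score) {
  if (score >= 90) {
    "excellent"
  } else if (score >= 60) {
    "pass"
  } else {
    "fail"
  }
}

print(classify(95))
print(classify(72))
print(classify(40))
```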
R For Loop
• The R For Loop repeats a block of statements once for each item in a Vector, and stops when there are no items left. The for loop is one of the most used loops in any programming language. Its syntax in R is: for (item in Vector) { statements }
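A short sketch of a for loop that sums the items of an illustrative vector:

```r
values <- c(2, 4, 6)
total <- 0

for (v in values) {
  total <- total + v   # runs once per item in values
}

print(total)
```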
While Loop in R
• The While loop in R Programming repeats a block of statements as long as the specified expression is TRUE; the loop ends as soon as the expression becomes FALSE.
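A minimal sketch of a while loop that adds the numbers 1 to 5. The counter must be updated inside the loop, otherwise the condition never becomes FALSE.

```r
i <- 1
total <- 0

while (i <= 5) {
  total <- total + i
  i <- i + 1           # move towards the exit condition
}

print(total)
```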
R Repeat
• The R Repeat executes the statements inside the code block repeatedly. It has no condition of its own, so the loop must be ended explicitly with a break statement.
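A small sketch of a repeat loop with an explicit exit condition:

```r
count <- 0

repeat {
  count <- count + 1
  if (count >= 3) {
    break              # without this, the loop would never end
  }
}

print(count)
```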
R Break Statement
• Break and Next are the two essential statements in R Programming for altering the flow of a loop. Loops execute a particular block of statements a number of times, or until the test expression is false. There are situations where we have to terminate the loop without executing all the statements. In these situations, we can use the R Break and Next statements.
• The R Break statement is used to exit from any loop, such as a For Loop, While Loop, or Repeat Loop. While executing these loops, if R finds the break statement inside them, it stops executing the statements and immediately exits from the loop.
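A minimal sketch of break inside a for loop: the loop stops at the first value greater than 10 and never visits the remaining items. The data is illustrative.

```r
values <- c(3, 8, 12, 5)
first_big <- NA
visited <- 0

for (v in values) {
  visited <- visited + 1
  if (v > 10) {
    first_big <- v
    break              # exit the loop immediately
  }
}

print(first_big)
print(visited)
```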
R Next Statement
• We generally use the R Next statement inside a For Loop or While Loop. While executing these loops, if R finds the next statement inside them, it skips the rest of the current iteration and starts the next iteration from the beginning.
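A small sketch of next: even numbers are skipped, so only the odd numbers from 1 to 6 are summed.

```r
odd_sum <- 0

for (i in 1:6) {
  if (i %% 2 == 0) {
    next               # skip the rest of this iteration
  }
  odd_sum <- odd_sum + i
}

print(odd_sum)
```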
R Read CSV Function
The basic syntax to read the data from a csv file using R programming is read.csv(file, header, sep, quote), where:
• file: You have to specify the file name, or the full path along with the file name. You can also use the URL of an external (online) csv file. For example, sample.csv or "C:/Users/Suresh/Documents/R Programs/sample.csv"
• header: If the csv contains Column names as the first row, specify TRUE; otherwise, FALSE
• sep: Short for separator. You have to specify the character that separates the fields. "," means the data is separated by commas
• quote: If your character values (FirstName, Education column, etc.) are enclosed in quotes, you have to specify the quote type. For double quotes we use quote = "\"" in the read.csv function
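A minimal sketch of these arguments in use. To stay self-contained it first writes a small csv to a temporary file; the rows are made up for illustration.

```r
# Create a small sample csv in a temporary location
path <- tempfile(fileext = ".csv")
writeLines(c('FirstName,Education,Salary',
             '"Ann","Masters",5000',
             '"Bob","Bachelors",4200'), path)

# Read it back: first row is the header, fields are comma separated,
# character values are wrapped in double quotes
employees <- read.csv(path, header = TRUE, sep = ",", quote = "\"")

str(employees)
unlink(path)   # clean up the temporary file
```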
Accessing csv file Data
• In R programming, the read.csv function automatically returns the data as a Data Frame. So, all the functions that are supported by the Data Frame can be used on csv data.
NOTE: The name used to call a user-defined function must exactly match the name it was defined with (R is case-sensitive).
Functions in R Programming
• R Functions Syntax
• Function_Name: It can be any name you wish to give. Avoid using the system reserved keywords.
• Arguments: Every function accepts 0 or more arguments, depending on the user requirements. For example, add(2, 3).
• Local Variable Declaration: Sometimes, we may need some temporary variable to operate within a particular function. Then we can declare those variables inside the function. Remember, these variables are available to this particular function only; we can't access them outside this function.
• Logic: Any mathematical or other calculations you want to implement.
• Executable Statement: Any print statements to print some data from this particular function.
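The parts described above can be sketched with the `add(2, 3)` example from the list. The function body is an illustrative assumption, since the original slide's code is not preserved.

```r
# Function_Name: add; Arguments: a and b
add <- function(a, b) {
  total <- a + b                   # local variable + logic
  print(paste("Sum is", total))    # executable statement
  return(total)
}

result <- add(2, 3)
```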
Loop Functions
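The slide content for this section is not preserved. As a rough sketch, the loop functions named later in this review (lapply, sapply, apply, tapply, mapply) behave as follows on small illustrative inputs.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))

as_list   <- lapply(scores, mean)   # always returns a list
as_vector <- sapply(scores, mean)   # simplifies to a named vector

m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)        # 1 = over rows, 2 = over columns

# Sum of values within each group
by_group <- tapply(c(100, 200, 300), c("A", "A", "B"), sum)

# Walk over several vectors in parallel
pair_sums <- mapply(function(x, y) x + y, 1:3, 4:6)

print(as_vector)
print(row_sums)
print(by_group)
print(pair_sums)
```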
Example 1: transpose of a matrix
Example 1: Solution
# a poor alternative to built-in t() function
mytrans <- function(x) {
  if (!is.matrix(x)) {
    warning("argument is not a matrix: returning NA")
    return(NA_real_)
  }
  y <- matrix(1, nrow=ncol(x), ncol=nrow(x))
  for (i in 1:nrow(x)) {
    for (j in 1:ncol(x)) {
      y[j,i] <- x[i,j]
    }
  }
  return(y)
}
# try it
z <- matrix(1:10, nrow=5, ncol=2)
tz <- mytrans(z)
Example 2: R Home_work
Example 2: Solution
Task 1
Task 2
Example 3: Learn by Example
• Review the LearnRbyExample.R
Example 4
manager<-c(1:5)
date<-c("10/24/08","10/28/08","10/1/08","10/12/08","5/1/09")
country<-c("us","us","uk","uk","uk")
gender<-c("M","F","F","M","F")
age<-c(32,45,25,39,99)
q1<-c(5,3,3,3,2)
q2<-c(4,5,5,3,2)
q2<-c(5,2,2,4,1)
q3<-c(5,2,5,4,1)
q4<-c(5,5,5,NA,2)
q5<-c(5,5,2,NA,1)
leadership<-data.frame(manager,date,country,gender,age,q1,q2,q3,q4,q5,stringsAsFactors=FALSE)
leadership
# Please use within() to modify the original data frame and add a column named "agecat".
# If age < 55, agecat = "young"; if age > 75, agecat = "elder"; if 55 <= age <= 75, agecat = "middle age".
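One possible way to approach this exercise is sketched below. To stay self-contained it uses a reduced version of the leadership data (only `manager` and `age`); it is one solution among several, not necessarily the intended one.

```r
leadership <- data.frame(manager = 1:5, age = c(32, 45, 25, 39, 99))

# within() evaluates the block using the data frame's columns
# and returns a modified copy, which we assign back
leadership <- within(leadership, {
  agecat <- NA
  agecat[age > 75] <- "elder"
  agecat[age >= 55 & age <= 75] <- "middle age"
  agecat[age < 55] <- "young"
})

print(leadership)
```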
Example 5
cash<-data.frame(company = c("A", "A", "B"), cash_flow = c(100, 200, 300),
                 year = c(1, 3, 2))
mylist<-list(cash,cash)
lapply(mylist, function(x) x[x==100]=NA )
# Run this code and check whether the result is right. You are supposed to replace 100 in the original list with NA. If it is wrong, how would you modify the code?
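A sketch of one possible answer. The anonymous function in the exercise returns the value of the assignment (NA) rather than the modified data frame, and the result of lapply() is never stored. The fix below returns `x` explicitly and assigns the result back; `wrong` and `fixed` are illustrative names.

```r
cash <- data.frame(company = c("A", "A", "B"),
                   cash_flow = c(100, 200, 300),
                   year = c(1, 3, 2), stringsAsFactors = FALSE)
mylist <- list(cash, cash)

# Original behaviour: each result is just NA
wrong <- lapply(mylist, function(x) { x[x == 100] <- NA })

# Possible fix: return the modified data frame and keep the result
fixed <- lapply(mylist, function(x) {
  x[x == 100] <- NA
  x
})

print(wrong[[1]])
print(fixed[[1]])
```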
Example 5
cash<-data.frame(company = c("A", "A", "B"), cash_flow = c(100, 200, 300),
                 year = c(1, 3, 2))
sapply(cash,class)
apply(cash,2,class)
# What's the difference?
# apply(x) first applies as.matrix() to x, which converts every element to a character.
# Be careful when using lapply(), sapply(), apply(), tapply(), mapply()
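The coercion described above can be checked directly. In this sketch `stringsAsFactors = FALSE` is added (an assumption, so the result does not depend on the R version).

```r
cash <- data.frame(company = c("A", "A", "B"),
                   cash_flow = c(100, 200, 300),
                   year = c(1, 3, 2), stringsAsFactors = FALSE)

# sapply() visits each column of the data frame as it is
by_column <- sapply(cash, class)

# apply() converts the data frame to a matrix first; a matrix has a
# single type, so the numeric columns become character as well
via_matrix <- apply(cash, 2, class)

print(by_column)
print(via_matrix)
print(class(as.matrix(cash)[1, "cash_flow"]))
```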
Example 6
authors <- data.frame(
  surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 4)))
books <- data.frame(
  name = I(c("Tukey", "Venables", "Tierney",
             "Ripley", "Ripley", "McNeil", "R Core")),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...", "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  other.author = c("Ripley", "Ripley", "Ripley", NA, NA, "Ripley", "Venables & Smith"))
authors
books
# Merge authors and books by matching 'surname' and 'name'. Only rows with data from both x and y are included in the output.
# Merge authors and books by matching 'surname' and 'name'. Extra rows will be added to the output, one for each row in x that has no matching row in y.
# Merge authors and books by matching 'surname' and 'name'. Extra rows will be added to the output, one for each row in y that has no matching row in x.
# Merge authors and books by matching 'surname' and 'name'. All extra rows will be added to the output.
# List all the nationalities of authors who have written a book with Ripley.
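A sketch of how these tasks map onto the arguments of merge(). To keep it short and self-contained it uses reduced, illustrative versions of the two tables, so the row counts differ from the full exercise.

```r
authors <- data.frame(surname = c("Tukey", "Venables", "Ripley"),
                      nationality = c("US", "Australia", "UK"),
                      stringsAsFactors = FALSE)
books <- data.frame(name = c("Tukey", "Ripley", "R Core"),
                    title = c("Exploratory Data Analysis",
                              "Spatial Statistics",
                              "An Introduction to R"),
                    other.author = c("Ripley", NA, "Venables & Smith"),
                    stringsAsFactors = FALSE)

# Only rows present in both tables
inner <- merge(authors, books, by.x = "surname", by.y = "name")
# Keep every row of x (authors)
left  <- merge(authors, books, by.x = "surname", by.y = "name", all.x = TRUE)
# Keep every row of y (books)
right <- merge(authors, books, by.x = "surname", by.y = "name", all.y = TRUE)
# Keep every row of both
full  <- merge(authors, books, by.x = "surname", by.y = "name", all = TRUE)

# Nationalities of authors with a book co-written with Ripley
co_ripley <- !is.na(inner$other.author) & inner$other.author == "Ripley"
with_ripley <- unique(inner$nationality[co_ripley])

print(full)
print(with_ripley)
```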