Lab #1 Assignment
Due Date: October 11th, 11:59pm
This assignment is worth 10% of your final grade and has 4 sections. You will need to submit the "lab_1_lastname_firstname.R" file that you create in Part 1.1. Be sure to save regularly while you write your code for this assignment. Please number each part of your assignment in the form of a comment.
Section 1.
Part 1.1. Create a new folder called lab_1_assignment on your computer and change your R working directory to that folder. Create a new file and save it as lab_1_lastname_firstname in this folder. All your code for this assignment should be saved in this .R file. (0.5 mark)
Section 2.
Part 2.1. From September 29th to October 5th, you were asked to document the following:
– the date (YYYY/MM/DD)
– hours of sleep (hours)
– commute time (minutes)
– delays in commute (yes/no)
– social interactions (number)
– happiness (number, 1 [very unhappy] to 10 [very happy])
Using this data, please proceed to Part 2.2.
Part 2.2. You will need to create six vectors in R, one for each of the different types of data. Please assign the corresponding data to the following vector names: date, sleep_hours, commute_time, CommuteDelays, social_interactions, and happiness. Below each vector, you will also need to comment what type of variable is contained in that vector (excluding the vector that contains the dates). (3 marks)
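As a rough illustration of the vector step in Part 2.2, the sketch below builds the six vectors with c(). Every value shown is a made-up placeholder (including the year in the dates), so substitute your own records. It also uses class() to show how R stores a vector, which is not the same thing as the variable type you are asked to state in your comments.

```r
# Placeholder values only; replace them with your own seven days of records.
date <- c("2024/09/29", "2024/09/30", "2024/10/01", "2024/10/02",
          "2024/10/03", "2024/10/04", "2024/10/05")  # the year is an assumption
sleep_hours <- c(7, 6.5, 8, 5, 6, 7.5, 9)
commute_time <- c(35, 40, 30, 55, 45, 0, 0)
CommuteDelays <- c("no", "yes", "no", "yes", "no", "no", "no")
social_interactions <- c(4, 6, 3, 5, 8, 10, 2)
happiness <- c(7, 6, 8, 4, 6, 9, 8)

# class() reports R's storage class for a vector. This is separate from the
# variable type that the assignment asks you to describe in a comment.
class(sleep_hours)    # "numeric"
class(CommuteDelays)  # "character"
length(date)          # 7
```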
Part 2.3. Using the six vectors from Part 2.2, combine them to create one dataset/dataframe. You can name this dataframe whatever you would like, and it should appear in your Global Environment. There should be seven rows of data in this dataset. Please print this dataset so it is fully visible in your console. (1.5 marks)
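One possible way to combine vectors into a dataframe is data.frame(). The sketch below uses only three vectors to stay short, with placeholder values and a hypothetical dataframe name (lab_data); your own dataframe would use all six vectors from Part 2.2.

```r
# Placeholder vectors (made-up values; the year in the dates is an assumption).
date <- c("2024/09/29", "2024/09/30", "2024/10/01", "2024/10/02",
          "2024/10/03", "2024/10/04", "2024/10/05")
sleep_hours <- c(7, 6.5, 8, 5, 6, 7.5, 9)
happiness <- c(7, 6, 8, 4, 6, 9, 8)

# Each vector becomes one column; the column names are taken from the vector names.
lab_data <- data.frame(date, sleep_hours, happiness)  # lab_data is a hypothetical name

# Print the full dataframe to the console.
print(lab_data)

nrow(lab_data)  # 7
```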
Section 3.
Part 3.1. Last week you were instructed to save the data you collected in a CSV file. Be sure to save or move this CSV file to your lab_1_assignment folder. Import this CSV file into RStudio and assign it the name my_data. Print the dataframe to your console. (0.5 mark)
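A minimal sketch of importing a CSV file with read.csv() follows. So that it runs on its own, it first writes a small placeholder file to a temporary folder; the file name and values are assumptions, and in your assignment you would instead point read.csv() at the CSV file saved in your working directory.

```r
# Create a stand-in CSV file (placeholder values; hypothetical file name).
csv_path <- file.path(tempdir(), "my_data.csv")
write.csv(data.frame(sleep_hours = c(7, 6), happiness = c(8, 5)),
          csv_path, row.names = FALSE)

# Import the CSV file and assign it the name my_data.
my_data <- read.csv(csv_path)

# Print the dataframe to the console.
print(my_data)
```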
Part 3.2. You will need to create a new variable/column in the my_data dataset which should be called sleep_more_6hours. The sleep_more_6hours column will only contain two values: either a 0 or 1. To assign the appropriate value (i.e. 0 or 1) to each row of data, you will need to use the data from your sleep_hours column and use the following criteria: assign a 0 if the number of sleep hours was less than or equal to 6, and assign a 1 if the number of sleep hours was greater than 6. (1.0 mark)
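The 0/1 rule described in Part 3.2 can be sketched with ifelse(), which is one of several ways to recode a column. The sleep_hours values below are placeholders chosen to include the boundary case of exactly 6 hours.

```r
# Placeholder data; your my_data comes from your own CSV file.
my_data <- data.frame(sleep_hours = c(5.5, 6, 6.5, 7, 8, 4, 9))

# 1 when sleep_hours is greater than 6, otherwise 0 (so exactly 6 hours gives 0).
my_data$sleep_more_6hours <- ifelse(my_data$sleep_hours > 6, 1, 0)

my_data$sleep_more_6hours  # 0 0 1 1 1 0 1
```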
Part 3.3. Using the new column you created in Part 3.2 (i.e. sleep_more_6hours), identify how many rows of data contain sleep hours less than or equal to 6 hours (i.e. 0) and how many rows of data contain sleep hours more than 6 hours (i.e. 1). Please comment the number of rows for each value. Hint: You need to use a function in R to receive full marks for this part. (1.0 mark)
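One function that counts how often each value occurs is table(); whether it is the function intended by the hint is an assumption. The sketch below applies it to a placeholder 0/1 vector rather than to real data.

```r
# Placeholder 0/1 values standing in for the sleep_more_6hours column.
sleep_more_6hours <- c(0, 0, 1, 1, 1, 0, 1)

# table() tallies the number of occurrences of each distinct value.
counts <- table(sleep_more_6hours)
print(counts)

counts[["0"]]  # 3 rows with 6 hours of sleep or less
counts[["1"]]  # 4 rows with more than 6 hours of sleep
```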
Part 3.4. During the R portion of lecture, you learned how to subset datasets. Using these newly acquired skills, create a new dataframe (you can name this dataframe whatever you would like) which only contains your data from the first three days of data collection (i.e. September 29th, September 30th, and October 1st). Hint: This new dataframe should contain three rows of data with seven columns/variables (please include the new column that you created in Part 3.2). (0.5 mark)
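Row subsetting with square brackets is one way to keep only the first three days. The sketch below assumes the rows are already in date order and uses a two-column placeholder dataframe; your own result would have seven columns. The name first_three_days is hypothetical.

```r
# Placeholder data (made-up values; the year in the dates is an assumption).
my_data <- data.frame(
  date = c("2024/09/29", "2024/09/30", "2024/10/01", "2024/10/02",
           "2024/10/03", "2024/10/04", "2024/10/05"),
  sleep_hours = c(7, 6.5, 8, 5, 6, 7.5, 9)
)

# Keep rows 1 to 3 and all columns (nothing after the comma means "every column").
first_three_days <- my_data[1:3, ]

dim(first_three_days)  # 3 2
```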
Part 3.5. For this part, you will be creating a new vector in R. You can name this vector whatever you would like, and this vector will need to contain all the data (i.e. 7 elements) from one of your columns in the my_data dataframe. Be sure to use a different vector name than what was used in Part 2.2. (0.5 mark)
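Pulling a single column out of a dataframe with the $ operator returns a vector. The sketch below shows this on placeholder data; the vector name happiness_values is hypothetical and simply differs from the names used in Part 2.2.

```r
# Placeholder data; your my_data comes from your own CSV file.
my_data <- data.frame(
  sleep_hours = c(7, 6.5, 8, 5, 6, 7.5, 9),
  happiness = c(7, 6, 8, 4, 6, 9, 8)
)

# Extract one column as a standalone vector.
happiness_values <- my_data$happiness

is.vector(happiness_values)  # TRUE
length(happiness_values)     # 7
```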
Part 3.6. Using the my_data dataframe from Part 3.2 (i.e. the dataframe should include the column you created in Part 3.2), completely remove your last row of data (i.e. October 5th data) so it no longer exists in your my_data dataframe. (0.5 mark)
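A negative row index drops a row, and assigning the result back to the same name makes the removal permanent. The sketch below demonstrates this on placeholder data, assuming the last row is the October 5th record.

```r
# Placeholder data (made-up values; the year in the dates is an assumption).
my_data <- data.frame(
  date = c("2024/09/29", "2024/09/30", "2024/10/01", "2024/10/02",
           "2024/10/03", "2024/10/04", "2024/10/05"),
  sleep_hours = c(7, 6.5, 8, 5, 6, 7.5, 9)
)

# Drop the last row and overwrite my_data so the row no longer exists.
my_data <- my_data[-nrow(my_data), ]

nrow(my_data)  # 6
```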
Section 4.
Reflecting on what you have learned in lecture over the past 4 weeks, provide a response to the following questions in the form of a comment at the bottom of your code: Using the data that you collected over the past 7 days, which variables do you suspect were the most accurate to collect/capture? On the other hand, which variable(s) do you feel were the least accurate, and why? (1.0 mark)