Assignment 3: Using the dplyr R package for bioprocess data
CHEN40770: Data Science For Biopharmaceutical Manufacturing 14/10/2020
Contents
1 Introduction
2 Submitting your assignment
3 Questions
1 Introduction
In our practical session we learned how to manipulate data with the dplyr package in R.
For this assignment you will use a bioprocess dataset from a simulated penicillin fermentation process created by Dr. Stephen Goldrick at University College London. For more information see www.industrialpenicillinsimulation.com
Load the data as follows:
library(chen40770data1)
Load the tidyverse packages as follows:
library(tidyverse)
## -- Attaching packages ---------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.1     v purrr   0.3.4
## v tibble  3.0.1     v dplyr   1.0.2
## v tidyr   1.1.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## Warning: package 'dplyr' was built under R version 4.0.2
## -- Conflicts ------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

2 Submitting your assignment
To submit this assignment, add your answers to the R script in your project folder in RStudio Cloud. The R script should be properly commented and adhere to the tidyverse style guide at all times. At the top of the R script, please include your name and student number.
Important: Remember to save your work.
3 Questions
Total: 10 Marks
**All questions must be answered using a single dplyr statement using
1. What is the average Carbon evolution rate in g/L for the entire dataset? (2 marks)
2. What is the percentage of missing values in the Offline Biomass concentration column? (2 marks)
Hint: The missing values can be identified using the is.na() function. It is also possible to carry out a calculation within the summarise() function.
3. What are the minimum and maximum number of rows (measurements) captured for each batch in the dataset? (2 marks)
Hint: You can pipe the output from one summarize() function into another.
4. What is the minimum Oxygen offgas (%) used in any fermentation batch between 60 and 70 hours in culture? (2 marks)
5. Which of the 3 control strategies (recipe, operator & raman) has the largest median Generated heat between 100 and 150 hours of culture? When showing your answer, sort the medians from high to low. (2 marks)
Hint: You can include defect when identifying the control strategy.
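The two techniques mentioned in the hints for questions 2 and 3 can be illustrated on a small made-up table. This is only a sketch: the tibble toy and its columns batch and reading are hypothetical and are not taken from the assignment dataset, so the column names will differ in your own answers.

```r
library(dplyr)

# Toy data (hypothetical, not the assignment dataset):
# two batches, with some missing readings.
toy <- tibble(
  batch   = c(1, 1, 1, 2, 2),
  reading = c(0.5, NA, 0.7, NA, NA)
)

# is.na() returns TRUE/FALSE, so its mean is the fraction of missing
# values; the arithmetic is carried out inside summarise().
pct_missing <- toy %>%
  summarise(pct_missing = mean(is.na(reading)) * 100)

# The first summarise() returns one row per batch (the row count);
# the second summarise() collapses those rows into a single summary row.
row_range <- toy %>%
  group_by(batch) %>%
  summarise(n_rows = n()) %>%
  summarise(min_rows = min(n_rows), max_rows = max(n_rows))
```

Each result is produced by a single piped statement, which is the form the questions ask for. Depending on your dplyr version, the grouped summarise() may print a message about ungrouping the output; this does not affect the result.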