Instructions
BEng Biomedical Engineering Computational Statistics
Coursework 2 – Practical Examination
First, read the following instructions.
DO NOT TURN OVER OR START WORKING UNTIL INSTRUCTED TO DO SO.
This examination consists of 3 questions, worth a total of 75 marks. Answer all 3 questions. If you do not manage to complete all questions then please submit the work that you have completed. Each question will involve writing MATLAB code to meet some requirements. For some parts you will be required to provide short written answers. Please type these as comments in your MATLAB script file making it clear which question part they correspond to.
The questions should be completed individually under examination conditions. You may not consult any other students for the duration of the examination. This is an open-book examination. You may refer to any textbooks, course materials, source code or other materials available from the internet or KEATS. You may not communicate with others (e.g. by email, instant messaging or social media sites) during the examination.
Before you start, download the zip file from the link "Coursework 2 download files" under the Assessment section on the module KEATS page. Extract the folder and its contents onto your computer. Write and save all of your work in the examination in this folder.
Time allowed: 1 hour 30 minutes.
When the invigilator instructs you to stop work please do so immediately. You should then combine all of your solution files into a single zip file and submit it through KEATS as instructed.
Assessment
Your coursework will be marked based on the answers you provide and how well the code you write meets the requirements specified.
The overall mark for this coursework will make up 35% of your total mark for this module.
Question 1
Researchers are investigating factors that might influence the exam performance of students. Initially, the researchers focused on parental income as the influencing factor and arranged the students into 5 groups of parental income based on percentiles (i.e. 0-20%, 21-40%, 41-60%, 61-80% and 81-100%).
The study data are supplied in the file exam_performance_part_one.mat. Each variable contains exam scores (in %) for the named group. For example, the variable G1 contains the exam scores for those students who come from a family with income in the 0-20% group. There are 100 students in each group. Throughout the following analysis you can assume that the data are normally distributed.
Create a MATLAB script called q1.m and write all of your answers to this question in this file. For written answers that do not require MATLAB code, write your answers as a comment in the script (i.e. start the line with a % symbol).
(a) Are the data collected for this initial study univariate or multivariate? [2 marks]
(b) How many groups are there in the initial study data? [2 marks]
(c) For each of the groups, produce an appropriate visualisation of the distribution of the data collected. [8 marks]
(d) Perform a one-way ANOVA test on these data to determine, with 95% confidence, whether there is any difference in mean exam percentage across the groups. Comment on the result of the test. [8 marks]
(e) As well as parental income, the following additional factors were subsequently included in the study: air quality (around school); diet; literacy level and attendance record. For each factor, students were classed into low (poor) or high (good) categories. These study data are supplied in the file exam_performance_part_two.mat. Each factor contains exam scores (in %) for the group and category involved. For example, the variable Diet_poor contains the exam scores for those students categorised as having a poor diet. There are 500 students in each category.
Again working to a 95% degree of confidence, perform appropriate hypothesis tests to determine which of the five potentially influencing factors might affect exam performance. [9 marks]
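As an illustration of the kind of analysis that parts (c) to (e) call for, the following is a minimal sketch that runs on synthetic data rather than the supplied study files. The generated scores, the variable names (scores, low, high) and the choice of a Bonferroni-adjusted threshold are assumptions made for illustration only. The sketch assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Synthetic stand-in data (assumption): five groups of 100 scores each.
rng(1);                                  % fix the seed so the sketch is repeatable
scores = 60 + 10*randn(100, 5);          % columns play the role of five groups
scores(:, 5) = scores(:, 5) + 5;         % shift one group so a difference exists

% One box plot per group shows the spread of each distribution.
figure;
boxplot(scores, 'Labels', {'G1', 'G2', 'G3', 'G4', 'G5'});
ylabel('Exam score (%)');

% One-way ANOVA: each column of the matrix is treated as one group.
p_anova = anova1(scores, [], 'off');     % 'off' suppresses the table and figure
fprintf('One-way ANOVA: p = %.4f\n', p_anova);

% Two-sample t-test for a single two-category factor (low vs high).
low  = 58 + 10*randn(500, 1);            % hypothetical "poor" category
high = 60 + 10*randn(500, 1);            % hypothetical "good" category
[h, p_t] = ttest2(low, high, 'Alpha', 0.05);
fprintf('Two-sample t-test: h = %d, p = %.4f\n', h, p_t);

% When several factors are tested, one common option (an assumption here,
% not something the question mandates) is a Bonferroni-adjusted threshold.
n_tests = 5;
significant = p_t < 0.05 / n_tests;
fprintf('Significant after Bonferroni adjustment: %d\n', significant);
```

A p-value below the chosen threshold indicates evidence against the null hypothesis of equal means; it does not by itself say which groups differ.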
Question 2
A biomedical engineering company has developed a new robotic surgery technique for use in high risk operations. To evaluate their technique, a trial has been performed in which 50 patients underwent surgery using the new robotic technique and 45 patients had surgery using the traditional approach. Patients were assigned randomly into one of the two groups (traditional or robotic surgery). For each operation, the survival or death of the patients and the type of surgery were recorded.
All data are available to you in the file surgery.mat, which contains the following variables:
• survival: Whether or not each patient survived: ‘A’ = alive; ‘D’ = dead.
• surgery: The type of surgery undergone: ‘T’ = traditional; ‘R’ = robotic.
Create a MATLAB script called q2.m and write all of your answers to this question in this file. For written answers that do not require MATLAB code, write your answers as a comment in the script (i.e. start the line with a % symbol).
(a) What type of statistical data were recorded in this study? [4 marks]
(b) Load the data into the MATLAB workspace and produce an appropriate visualisation of the relationship between the two variables. [6 marks]
(c) Perform an appropriate hypothesis test to determine if the survival of patients having this type of operation is affected by the type of surgery used. Work to a 95% degree of confidence and state your conclusion in plain English. [6 marks]
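For two categorical variables such as these, one possible approach is a contingency table with a chi-square test of independence. The following is a minimal sketch on synthetic data: the counts of survivors and deaths are invented for illustration and are not the trial results, and the choice of a stacked bar chart is only one of several reasonable visualisations. The sketch assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Synthetic stand-in data (assumption): 50 robotic and 45 traditional patients.
surgery  = [repmat('R', 50, 1); repmat('T', 45, 1)];
survival = [repmat('A', 40, 1); repmat('D', 10, 1); ...   % robotic outcomes
            repmat('A', 30, 1); repmat('D', 15, 1)];      % traditional outcomes

% crosstab builds the contingency table and performs a chi-square test
% of independence between the two grouping variables.
[tbl, chi2, p, labels] = crosstab(surgery, survival);
disp(tbl);
fprintf('chi-square = %.3f, p = %.4f\n', chi2, p);

% Stacked bar chart of the counts: one bar per surgery type.
figure;
bar(tbl, 'stacked');
set(gca, 'XTickLabel', labels(:, 1));
legend(labels(:, 2));
ylabel('Number of patients');
```

Rows of tbl correspond to the levels of the first argument and columns to the levels of the second, with the level names returned in labels.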
Question 3
A research project is investigating whether a new proton therapy treatment machine can lead to better outcomes than traditional radiotherapy treatment of cancer patients. A cohort of 47 patients had their tumour volumes (in mm³) measured using Positron Emission Tomography imaging before treatment. Twelve of the 47 patients, who were considered to have more advanced cancer, were selected for proton therapy treatment, and the other 35 underwent radiotherapy. The tumour volumes for all patients were measured again after treatment.
All measurements are available to you in the file tumour_volumes.mat, which contains the following variables:
• volume_before_radio: Tumour volumes for the 35 radiotherapy patients before treatment.
• volume_after_radio: Tumour volumes for the 35 radiotherapy patients after treatment.
• volume_before_proton: Tumour volumes for the 12 proton therapy patients before treatment.
• volume_after_proton: Tumour volumes for the 12 proton therapy patients after treatment.
Create a MATLAB script called q3.m and write all of your answers to this question in this file. For written answers that do not require MATLAB code, write your answers as a comment in the script (i.e. start the line with a % symbol).
(a) Based on the provided data, use MATLAB to determine whether radiotherapy and/or proton therapy are effective at reducing tumour size. Be as thorough as possible in your data analysis, and show all of your reasoning when choosing an appropriate test. Use a 95% confidence level. [26 marks]
(b) Explain any concerns that you have about this experiment and how they might be addressed. [4 marks]
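Before-and-after measurements on the same patients are paired, so one possible line of reasoning is to check the paired differences for normality and then choose between a paired t-test and a non-parametric alternative. The following is a minimal sketch on synthetic data: the volumes, the variable names (before, after, d) and the use of the Lilliefors test as the normality check are assumptions for illustration, not the required method. The sketch assumes the Statistics and Machine Learning Toolbox is available.

```matlab
% Synthetic stand-in data (assumption): 35 paired tumour volumes in mm^3.
rng(2);                                   % fix the seed so the sketch is repeatable
before = 500 + 50*randn(35, 1);           % hypothetical volumes before treatment
after  = before - 20 + 15*randn(35, 1);   % hypothetical volumes after treatment

% The paired differences are what the paired tests operate on.
d = before - after;
figure;
histogram(d);
xlabel('Reduction in tumour volume (mm^3)');

% Lilliefors test: h_norm = 0 means no evidence against normality at 5%.
h_norm = lillietest(d);

if h_norm == 0
    % Paired t-test, right-tailed: H1 is that mean(before - after) > 0.
    [h, p] = ttest(before, after, 'Tail', 'right');
    test_name = 'paired t-test';
else
    % Wilcoxon signed-rank test as a non-parametric alternative.
    [p, h] = signrank(before, after, 'tail', 'right');
    test_name = 'Wilcoxon signed-rank test';
end
fprintf('%s: h = %d, p = %.4f\n', test_name, h, p);
```

The same steps would be repeated for the second treatment group. With only 12 patients in one group, a normality check has limited power, which is worth bearing in mind when justifying the choice of test.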