
STAT0004 Take Home Assessment 2019
Deadline: 16:00 Wednesday 24 April 2019
You will work in a group (of approximately 3 students) to produce a single report.

1 Rules for the take home assessment
Your group will submit, via the Submit your take home assessment Moodle page, one PDF file containing your report and one R program file.
Each group member must click their respective “Submit” buttons in order for the group’s submission to be successful and final. By ticking the submission declaration box in Moodle you are agreeing to the following declaration:
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work my group is submitting for this in-course assessment is entirely our own.
The Turn-It-In® plagiarism detection system may be used to scan your submission for evidence of plagiarism and collusion.
You will work together within your group and the usual plagiarism and collusion regulations do not apply to this form of interaction. However, they do apply to collusion with other groups or plagiarism of work from other groups or from other sources. Any plagiarism will normally result in zero marks for all students involved, and may also mean that your overall examination mark is recorded as non-complete. Guidelines as to what constitutes plagiarism may be found in Departmental Student Handbooks. The relevant excerpt from the Statistical Science handbook is also posted on Moodle. Late submission will incur a penalty unless there are extenuating circumstances (e.g. medical) supported by appropriate documentation. Penalties are set out in the latest editions of the Statistical Science Department student handbooks, available from the departmental web pages.
Failure to submit this in-course assessment may mean that your overall examination mark is recorded as “non-complete”, i.e., you will not obtain a pass for the course. All members of a group will be awarded the same mark for the assignment.
I may ask you as a group to come and discuss your output with me.
You will receive, via Moodle, feedback on your work and a provisional grade — grades are provisional until confirmed by the Statistics Examiners’ Meeting in June 2019.

2 Task description
As a group, you will describe and analyse daily returns (see the ‘Problems’ section for a formal definition) of two stocks in the period 2015 – 2018. The data are described in the ‘Data’ section at the end of this document. You do not need to investigate their source further — just report on the data as they are presented to you. Also do not introduce other data into your work. Your group will prepare and submit a single, short, structured report that addresses each of the problems set out below. All the summary statistics in the report should come from an R program, or be readable from plots in the report. Your report and program will be marked by me (Tengyao Wang) and you may be required to discuss them with me. You will receive group specific feedback on your submission.
To complete this assignment successfully you need to start work very soon after the assignment is set and to plan your time carefully. It is quite possible to complete the assignment by the end of the second term.
Problems
The return of a stock on a given trading day is defined as the increase in its closing price from the previous trading day (negative if it is a decrease). Mathematically, if $p_i$ is the closing price on trading day $i$, then the return on day $i$ is $p_i - p_{i-1}$.
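As a small illustration of this definition, the sketch below computes returns from a short vector of made-up closing prices. The vector `price` and its values are hypothetical and are not taken from the assessment data.

```r
# Hypothetical closing prices for five consecutive trading days (made-up numbers).
price <- c(100.0, 101.5, 101.0, 103.2, 102.7)

# diff() gives p_i - p_{i-1} for i = 2, ..., n, so the result has one
# fewer element than the price vector (the first day has no previous close).
ret <- diff(price)
print(ret)
```

Because the first trading day has no previous closing price, a series of n prices yields n - 1 returns.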
Please address the following problems in your take home assessment.
1. Using techniques such as summary statistics and plots, describe the returns of the two stocks assigned to your group on each of the trading days (except 2 Jan 2015, for which the closing price of the previous trading day is unknown). Your description should include both univariate and multivariate analysis.
2. A stock analyst claims that there is increased market volatility in 2018 compared to previous years, represented by an increase in the magnitude of stock price movements.
(a) Using the absolute values of the returns for the first of your two stocks, carry out an appropriate two sample t-test for the analyst’s claim.
(b) Explain to this analyst, in non-technical terms, the result of your test in a single sentence.
3. An investor wants to predict returns of a stock on a given day using its return from the previous trading day.
(a) Using the second of your two stocks, build an appropriate simple linear model to investigate the feasibility of her strategy. Set out the model in mathematical notation and find the regression coefficients.
(b) Explain to this investor, in non-technical terms, the meaning of your regression coefficients and how well her strategy might work.
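For Problem 2(a), one possible shape of a two-sample t-test in R is sketched below on synthetic data. The vectors `ret_prev` and `ret_2018` are hypothetical stand-ins for real returns, and the choices of a one-sided alternative and of the Welch (unequal variance) form, which is the default of `t.test()`, are assumptions that your group should justify or change.

```r
set.seed(1)  # reproducible synthetic data, not real stock returns

# Hypothetical returns: smaller spread in 2015-2017, larger spread in 2018.
ret_prev <- rnorm(750, mean = 0, sd = 1.0)
ret_2018 <- rnorm(250, mean = 0, sd = 1.5)

# One-sided Welch two-sample t-test on the absolute returns.
# H0: mean |return| in 2018 equals the mean |return| in earlier years
# H1: mean |return| in 2018 is greater
tt <- t.test(abs(ret_2018), abs(ret_prev), alternative = "greater")
print(tt)
```

Setting `var.equal = TRUE` in the call would give the pooled-variance version of the test instead.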

For the linear model, you may assume that the residual plots raise no issues about model assumptions or fit and you should not attempt to analyse or study them (I know that this is not what normally happens but I am trying to make your life easier).
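A minimal sketch of how a lagged simple linear regression for Problem 3(a) might be set up in R is given below. It uses synthetic data, and the names `ret`, `lag_df`, `today` and `yesterday` are hypothetical. The model fitted is today's return = beta0 + beta1 × previous day's return + error.

```r
set.seed(2)  # synthetic data only

# Hypothetical return series of length n.
n <- 500
ret <- rnorm(n)

# Pair each day's return (response) with the previous day's return (predictor).
lag_df <- data.frame(
  today     = ret[-1],  # r_2, ..., r_n
  yesterday = ret[-n]   # r_1, ..., r_{n-1}
)

# Simple linear model: today = beta0 + beta1 * yesterday + error.
fit <- lm(today ~ yesterday, data = lag_df)
print(coef(fit))               # estimated intercept and slope
print(summary(fit)$r.squared)  # proportion of variance explained
```

Dropping the first element for the response and the last for the predictor keeps the two columns aligned, so each row holds a return and the return of the trading day before it.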
Report
Name your report group*.pdf, where * should be replaced with your group number (e.g., for Group 01, it should be group01.pdf). The report should be consistent with the following:
• You must use the Microsoft Word template provided on the Moodle page for your report and are not allowed to change its font, font sizes or margins (if you wish to be allowed to use alternative word processing software then you must agree the details with me before submission). If the template has been changed, up to 4% of marks can be lost and I will reformat the document to the template standard, to which the following point will apply.
• The report must not be longer than 2 pages (2 sides) in A4 paper, including figures. I will only mark the first 2 pages of any report. Note that this doesn’t mean that you should aim to fill all the space available to you. Writing more text doesn’t necessarily get you more marks.
• The report must be capable of being read on its own: i.e., it should not refer to the R program but just contain data/plots from the program’s output.
• Please save your report as a PDF file from Microsoft Word (FILE, Export, Create PDF/XPS).
• It must be written in clear comprehensible English with readable and well-labelled figures.
• Your report should be anonymous — i.e., there should be no mention of group members’ names anywhere in your submission.
R program
Name your R program group*.r, where * should be replaced with your group number (e.g., for Group 01, it should be group01.r). Your R program should:
• Be clearly laid out and well commented with both a description of the program at the start and suitable notes throughout.
• Assume that the working directory has already been set to the location of the data file and to where any output files will be stored, i.e., there should be no setwd() command or reference to directories.
• Import the data from the data.csv file.
• Create an output file named output.txt using the sink() function, containing only the statistics you use in your report. Your program may investigate other things but the output file should contain all the information you use in your report and should be created when I run your program using the source() function in R. The output should be well laid out and contain appropriate descriptions. The output file itself should not be included in your submission.
• Create a .jpg image file for each plot (or set of plots) that you use in your report, and no others. Name the image files fig1.jpg, fig2.jpg, . . . , following the same order in which they appear in your report.
• Be anonymous — i.e., there should be no mention of group members’ names anywhere in your submission.
Your program may use non-standard packages and you should assume that they are installed on my computer.
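The output requirements above can be sketched as follows. This is only an illustration on synthetic data: the vector `ret` is a hypothetical stand-in for a real return series, and the `capabilities("jpeg")` guard is an addition so that the sketch also runs on R builds without JPEG support.

```r
set.seed(3)
ret <- rnorm(100)  # stand-in for a real return series

# Divert printed output to output.txt in the working directory.
sink("output.txt")
cat("Summary of returns (synthetic stand-in data)\n")
print(summary(ret))
cat("Standard deviation:", sd(ret), "\n")
sink()  # stop diverting output and close the file

# Save each plot used in the report to its own .jpg file.
if (capabilities("jpeg")) {
  jpeg("fig1.jpg", width = 800, height = 600)
  hist(ret, main = "Histogram of returns", xlab = "Return")
  dev.off()  # close the device so the file is written
}
```

Every `sink("output.txt")` needs a matching `sink()` and every `jpeg()` a matching `dev.off()`, otherwise later output may still go to the file or the image may be incomplete.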
Data
The data for your group are available in Moodle. Each group will receive a text file, named tickers*.txt with * being your group number (e.g., for Group 01, it should be tickers01.txt), that contains the ticker symbols (abbreviations of stocks’ names) of two stocks they will analyse. You only need to analyse the two stocks given to your group.
In addition, all groups will use the same master data file, data.csv, which has 150 columns:
• Year: 2015 to 2018
• Month: 1 to 12
• Day: 1 to 31, the first day is 2 Jan 2015 and the last 28 Dec 2018.
• TradingDay: 1 to 252, indexing the days on which stocks are traded within a year.
• Columns 5 to 150: each column contains the closing prices, in US dollars, of a specific stock (represented by its ticker symbol in the column header) over the data period.
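To illustrate this layout, the sketch below builds a tiny stand-in file with the same four date columns and three made-up stocks, then selects two of them and computes their returns. The tickers `AAA`, `BBB` and `CCC`, the prices, and the temporary file are all hypothetical. In the real program the file would be read with read.csv("data.csv") and the tickers would come from your group's tickers*.txt file.

```r
# Build a tiny stand-in for data.csv (made-up prices, hypothetical tickers).
demo <- data.frame(
  Year = 2015, Month = 1, Day = c(2, 5, 6), TradingDay = 1:3,
  AAA = c(10.0, 10.4, 10.1),
  BBB = c(50.0, 49.5, 51.0),
  CCC = c(7.0, 7.1, 7.3)
)
csv_path <- tempfile(fileext = ".csv")  # temporary file, for this sketch only
write.csv(demo, csv_path, row.names = FALSE)

# Read the file back, as the real program would read data.csv.
dat <- read.csv(csv_path)

# Keep the four date columns plus the two assigned tickers.
date_cols  <- c("Year", "Month", "Day", "TradingDay")
my_tickers <- c("AAA", "BBB")  # hypothetical ticker symbols
prices <- dat[, c(date_cols, my_tickers)]

# Daily returns for each stock; the first trading day has no return.
returns <- cbind(
  prices[-1, date_cols],
  as.data.frame(lapply(prices[my_tickers], diff))
)
print(returns)
```

Selecting columns by ticker name rather than by position keeps the code unchanged if the column order of the master file differs from what is expected.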