MATLAB assignment writing: ELEC2103/9103 Assignment

Statistical and predictive data analysis

Modelling, predicting, and verifying the accuracy of models are vital skills in engineering and other fields. This assignment will assess your ability to develop and validate statistical models using MATLAB, and in particular, the Statistics and Machine Learning Toolbox.

Key information

  •   Due Wednesday 5 October 2016, worth 15% of your final grade.
  •   Submission comprises one or more m-files, and must build on the elec2103a.m file provided.
  •   Submission via Turnitin (Blackboard).
  •   Feedback available until Wednesday 21 September 2016 (via Piazza, please).

    Background
    This assignment asks you to explore and analyse data collected through the Smart Grid, Smart City Program.

    The Smart Grid, Smart City Program, which concluded in 2014, was arguably one of the widest-ranging technology assessments of smart grid products in the world. It involved approximately 17,000 electricity customers in consumer-focused trials examining how residential customers could contribute to peak demand management through behavioural changes.

    The Smart Grid, Smart City Program focused on residential customers, as they represent the largest user group in Australia, and generally have more discretion over when and how much energy they use. Little was known before the Smart Grid, Smart City trials about how customers perceived, or how they might respond to, the opportunities that smart grid technologies offer.

    Much more background information on the Smart Grid, Smart City Program can be found at: http://www.industry.gov.au/Energy/Programmes/SmartGridSmartCity/Pages/default.aspx

    The assignment task

    You are to explore and analyse part of the Smart Grid, Smart City Customer Trial Data. You are to complete your analysis using MATLAB, and present your analysis as a report contained in a script file that can be published to an HTML report using MATLAB’s Publish feature. You are encouraged to share ideas, but your submitted assignment must be uniquely your own.
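As a rough illustration of what a publishable script looks like: it is an ordinary m-file split into sections by lines starting with `%%`, and comment lines directly under a section title are rendered as report text. The sketch below is not the contents of the provided stub; the section titles and the load profile are made-up placeholders.

```matlab
%% ELEC2103/9103 assignment report (placeholder title)
% Comment lines directly below a section title become text in the
% published report.

%% Example section (illustrative only)
% Synthetic numbers stand in for trial data here.
t = (0:47)' / 2;                                  % half-hourly time stamps over one day (hours)
load_kwh = 0.3 + 0.2 * sin(2*pi*(t - 6)/24).^2;   % made-up load profile, not real data
plot(t, load_kwh)
xlabel('Hour of day')
ylabel('Energy per interval (kWh)')

% To generate the report, run the following from the Command Window
% (not from inside the script itself, which would publish recursively):
%   publish('elec2103a.m', 'html')
```

Publishing runs the script from top to bottom, so any figure produced in a section appears in the report beneath that section's text.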

    The data

    The Smart Grid, Smart City Customer Trial data is available from: https://data.gov.au/dataset/smart-grid-smart-city-customer-trial-data

    There are a number of data sets available here, including:

  •   Electricity Use Interval Readings, showing the half-hourly interval meter readings (kWh) of electricity consumption and generation for each of the participating households in the Smart Grid Smart City customer trial. A sample of this data is shown on the cover of the assignment description.

  •   Customer Household Data, showing details of individual households participating in the Smart Grid Smart City customer trial, including the number of inhabitants, age of inhabitants, income level, appliance ownership and use (including solar generation), type of dwelling, rental status, location, and other details.

  •   Peak Events and Peak Event Response data, which shows when peak network events were called by the electricity retailer or distribution business as part of a tariff offering, and the response of trial customers, during the Smart Grid Smart City customer trials.

This data comprises varied sets of time-series, cross sectional and panel data, of both numerical and categorical types.
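One possible way of getting mixed time-stamped, numerical and categorical data into a usable form is sketched below. It writes a tiny synthetic CSV file and reads it back, so it runs without the trial data; the file name and the column names (`customer_id`, `reading_time`, `kwh`, `dwelling`) are hypothetical and will not match the real data sets.

```matlab
% Build a small synthetic data set (all values and names are made up).
customer_id  = [1; 1; 1; 2; 2; 2];
reading_time = datetime(2013, 7, 1) + minutes([0; 30; 60; 0; 30; 60]);
reading_time.Format = 'yyyy-MM-dd HH:mm:ss';       % fixed text format for the CSV
kwh          = [0.21; 0.18; 0.25; 0.40; 0.35; 0.38];
dwelling     = {'House'; 'House'; 'House'; 'Unit'; 'Unit'; 'Unit'};

csv_file = fullfile(tempdir, 'sgsc_sample.csv');   % hypothetical file name
writetable(table(customer_id, reading_time, kwh, dwelling), csv_file);

% Read the file back and make the variable types explicit.
T = readtable(csv_file);
if ~isdatetime(T.reading_time)                     % older releases read dates as text
    T.reading_time = datetime(T.reading_time, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');
end
T.dwelling = categorical(T.dwelling);              % categorical, not free text

% Panel-style summary: mean half-hourly consumption per customer.
[g, id]  = findgroups(T.customer_id);
mean_kwh = splitapply(@mean, T.kwh, g);
summary_tbl = table(id, mean_kwh, 'VariableNames', {'customer_id', 'mean_kwh'});

delete(csv_file);                                  % tidy up the temporary file
```

Holding the data in a table keeps each column's type alongside its name, which is convenient when the same table is later passed to a model-fitting function.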

You can also draw on other data sources to inform your analysis. If you have something particular in mind, I can advise you of whether it is freely available and where to find it.

Submission requirements

Your assignment will be submitted via Turnitin in the form of a .zip file named in the following format for undergraduate students:

FamilyName_FirstName_SID_ELEC2103_2016.zip

or for Masters students:

FamilyName_FirstName_SID_ELEC9103_2016.zip

E.g. Chapman_Archie_000000000_ELEC2103_2016.zip

The main file will be called elec2103a.m (regardless of whether you are an undergraduate or a postgraduate). You are provided with a MATLAB script stub to get you started, which is available on Blackboard. Submit the .zip file containing your main file, any custom functions that you write, and the data that is needed to complete your analysis.

Assignment criteria and grades

The assignment will be given a grade out of 15. Marks will be allocated in two tranches, as follows:

First, in order to earn a pass, the minimum requirements of this assignment are to:

  1. Get some data from the Smart Grid, Smart City Customer Trial into a useable format.
  2. Analyse the data in MATLAB by modelling/fitting it, using regression, classification or ANOVA methods.
  3. Assess the statistical errors in your model.
  4. Make a prediction using your model, perhaps into the future (for time series data) or across a new subset (for categorical data, i.e. across a new set of households).
  5. Discuss your results, including making an assessment of the reliability of the prediction, while also making appropriate use of plots and/or charts.
  6. Put your analysis in a publishable MATLAB script that runs without errors. Build on the provided m-file stub, and make sure your submitted .zip file contains your version of this file (not the stub), any custom function m-files that you write, and all of the data that is needed to complete your analysis.
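Requirements 2 to 4 above can be sketched with a linear regression, which is only one of the permitted methods. The sketch uses `fitlm`, `coefCI` and `predict` from the Statistics and Machine Learning Toolbox on synthetic data; the variable names (`temp_c`, `occupants`, `daily_kwh`) and the relationship between them are invented for illustration and say nothing about the trial data.

```matlab
rng(1);                                        % reproducible synthetic data
n = 200;
temp_c    = 10 + 25 * rand(n, 1);              % hypothetical daily max temperature (deg C)
occupants = randi(5, n, 1);                    % hypothetical household size
daily_kwh = 5 + 0.4 * temp_c + 2 * occupants + randn(n, 1);   % made-up relationship
tbl = table(temp_c, occupants, daily_kwh);

% Hold out the last 25% of rows to check predictions on unseen data.
n_train = round(0.75 * n);
train   = tbl(1:n_train, :);
holdout = tbl(n_train+1:end, :);

% Requirement 2: fit a linear regression model.
mdl = fitlm(train, 'daily_kwh ~ temp_c + occupants');

% Requirement 3: in-sample error measures and coefficient uncertainty.
rmse_train = mdl.RMSE;
r2         = mdl.Rsquared.Ordinary;
coef_ci    = coefCI(mdl);                      % 95% confidence intervals, one row per coefficient

% Requirement 4: predict the held-out rows, with 95% prediction
% intervals for new observations.
[y_hat, y_int] = predict(mdl, holdout, 'Prediction', 'observation');
rmse_test = sqrt(mean((holdout.daily_kwh - y_hat).^2));
coverage  = mean(holdout.daily_kwh >= y_int(:, 1) & holdout.daily_kwh <= y_int(:, 2));
```

Comparing `rmse_test` with `rmse_train`, and checking how often the held-out values fall inside the prediction intervals, gives a simple basis for discussing how reliable the prediction is.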

Satisfying each of the minimum requirements 1-5 above will earn you 1 mark, while requirement 6 is worth 2.5 marks. Accordingly, if you do these reasonably well you will get a pass in the assignment, i.e. 7.5 marks. Here, “reasonably” means more than joining two points at different times with a straight line and projecting into the future.

Examples of the requisite techniques to pass the assignment will be covered in the lectures and labs. Note that if the main MATLAB script you submit doesn’t run, I will spend a very small amount of time trying to assess the remaining requirements.

Second, to earn higher grades, you either need to do the first part extremely well (which might get you up to a credit), or you will need to add one or two additional advanced forms of statistical analysis and/or prediction, performed on the same data set. You can choose, but these could include:

  •   Formal statistical comparisons of more than one model or method of analysis.
  •   Advanced statistical analysis testing the assumptions of your modelling choice, such as tests of heteroscedasticity, multicollinearity, etc.
  •   Bootstrapping, jackknifing or some other resampling-based validation of your model.
  •   Sophisticated use of more than one data set.
  •   Doing something impressive with MATLAB’s visualisation or GUI tools (only one of these).
  •   Use of an advanced statistical estimation or machine learning technique, with justification. This could include:

      o Using MATLAB’s neural network tools (which doesn’t take much effort);
      o Advanced time series analysis, such as ARIMA or GARCH models;
      o Estimating a stochastic volatility or hidden Markov model;
      o Using Bayesian models;
      o Advanced clustering and/or hierarchical analysis;
      o If you have an interest in signal processing, you could investigate non-parametric kernel estimators (akin to kernel smoothing techniques), principal component analysis, or apply a series of bandpass filters over a time series and see what you get. Or come up with something else, after discussing with me.
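To make one of the options above concrete, here is a minimal sketch of resampling-based validation: a case-resampling (pairs) bootstrap of a straight-line fit using `bootstrp` and `prctile` from the Statistics and Machine Learning Toolbox. The data are synthetic and the helper name `fit_line` is an assumption introduced for this example; it is one of several valid bootstrap schemes, not a prescribed method.

```matlab
rng(2);                                        % reproducible synthetic data
n = 150;
x = 10 * rand(n, 1);
y = 1.5 + 0.8 * x + randn(n, 1);               % made-up linear relationship plus noise

% Refit a straight line to a resampled data set and return
% [intercept, slope] as a row vector. bootstrp draws the same random
% rows (with replacement) from x and y on each replicate.
fit_line = @(xb, yb) ([ones(size(xb)), xb] \ yb)';

n_boot    = 1000;
boot_coef = bootstrp(n_boot, fit_line, x, y);  % n_boot-by-2 matrix of coefficients

% Bootstrap standard error and percentile 95% interval for the slope.
slope_se = std(boot_coef(:, 2));
slope_ci = prctile(boot_coef(:, 2), [2.5 97.5]);
```

If the bootstrap interval is much wider than the interval reported by the fitted model, that is a hint that the model's error assumptions deserve a closer look.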

Please note that simply applying a fancy method is insufficient for the purposes of this assessment. Instead, you will need to justify your choice. For example, writing “I used an Ornstein-Uhlenbeck process to model the variations in X” is not sufficient justification; while writing “An Ornstein-Uhlenbeck process is used because mean-reverting processes are appropriate for the setting of X” is much better. Another good justification is choosing an advanced model based on insights drawn from the output of a simpler one.

Please limit yourself to three forms of analysis or investigations in total, including your first approach that satisfies the minimum requirements listed in requirements 1-6. If you include more I will only assess the first three. If you are unsure where the boundaries of an analysis are, contact me for clarification.

Marks: In addition to the 7.5 marks available for requirements 1-6, each additional piece of analysis will be worth at most 5 marks, at my discretion. This means for the assignment, you can score up to 17.5/15! This is not a mistake, but is included to encourage you to try new things. For example, trying two advanced methods and scoring only 3.5/5 for each gets you 14.5/15. However, note that any score above 15 marks will be capped at the full 15% weighting in your final unit grade. Curb your enthusiasm accordingly.
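The arithmetic behind these figures, written out as a short check (the variable names are just labels for this example):

```matlab
base_marks  = 5 * 1 + 2.5;          % requirements 1-5 at 1 mark each, plus 2.5 for requirement 6
max_raw     = base_marks + 2 * 5;   % two additional analyses at up to 5 marks each: 17.5
example_raw = base_marks + 2 * 3.5; % the example in the text: 14.5
capped      = min([max_raw, example_raw], 15);   % anything above 15 counts as 15
```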

For the additional analyses, the more sophisticated the techniques you use, the higher the mark you will score. Examples of these advanced techniques will not be covered in the lectures or the labs. You are required to discover them for yourself, but do feel free to discuss your proposed approaches with me (via Piazza).

There are no minimum or maximum lengths to the submission, but treat this like you are trying to convince a busy person that you have something important to say – being terse and direct is not a bad thing in engineering and business communication.

Late assignments will be penalised by deducting 2 marks and by reducing the maximum grade achievable by 2 marks, for each 24 hours overdue, including weekends. Don’t be late.

Best of luck, Archie
