编程代写 SQL database hadoop Hive MySQL Coursework 2.i

MySQL Coursework 2.i
Overview

The second coursework for the Databases/Big Data topic is in two parts

· This work starts by extending that done on using Access to generate a SQL server-based solution, firstly using MySQL and then MS-SQL (with data warehousing between two servers).

· It then moves to BigData processing techniques (e.g. Hadoop, Hive, Spark etc.)

The breakdown of these tasks are

· Use of MySQL (this document)
· Use of MS-SQL
· Big Data Processing tasks

Submission (an overview)

The submission of coursework two will comprise

1. SQL scripts for the generate of required database schemas
2. Backup/detached SQL database files
3. Code/scripts etc. for the BigData tasks
4. Written Report (covering all aspects of the work)

Full details on the submission process (e.g. for creating (1) & (2) are be provided).

Marking rubrics have be provided for tasks (to ensure all aspects of the work are understood).

For the written report, a template & marking rubric will be provided.

Practical support

The seminar sessions will provide a chance for you to develop, with support, your solutions and will also allow you to ask for feedback on your solutions prior to the final submission (I will of course be answering queries by email and/or team).

Coursework 2, Task 1: Instructions (MySQL tasks)

This section details the work required for the first of these, the MySQL implementation of the ‘student marks system’ (based on the original Access work)

Tasks 1 & 2 are to be done via a single SQL script which processes the sample data into the relevant tables without any updating of the original table, for task 3 generate views for the data.

Task 4 requires the creation of a stored procedures that manipulate the marks table and (for 4.3 & 4.4) updates all statistics in a single function. The fact that the any script has been run should be logged in an audit table (you will need to add an additional table for this).

1. Database Table generation

Remember: You are to generate a single script to complete this task

· To rationalise the data into a series of tables (remembering to consider field sizes/types)
· Define the relationships between the tables; they must be created in such a fashion as to ensure complete data integrity across the database

2. Statistics table

Remember: You are to generate a single script to complete this task

A specific requirement of the database is that there is a table of student IDs & full year statistics (credits taken, credits passed, year average etc.).

3. Views

Once the tables have been created, you are to generate views to achieve the following

· V1 Produce a list of students and their marks.

· V2 A list of module statistics (min, max & average mark)

· V3 Produce a list of students who failed to pass in any academic year. There are three ways this occurs

· An overall year average of < 40 · More than 20 credits in the range 30-39 · A mark below 30 Note: Views should provide such fields as would make the data usable (i.e. a list of marks would need to include details such as the student ID, Name etc.). The names of the views should be indicative to the tasks they perform. 4. Data processing stored procedures The following stored procedure are to be developed 4.1 A procedure to update the FULL statistics table from the marks data – the fact the procedure has been run (and when) should be logged to a table in the 4.2 A procedure to update the statistics table from the marks data for a given student– the fact the procedure has been run (and when) should be logged to a table in the 4.3 An add/update procedure that, for a given student/year/module, will add a new mark or update an existing mark 4.4 A delete procedure that, for a given student/year/module/mark will delete an existing mark Remember: · Running of a stored procedure is to be logged · Stored procedures 4.3 & 4.4 should, as part of their operation, update the statistics table 5. Calculation of weighted average (a reminder) All averages are to be weighted averages, calculated as Average = sum ( module_mark * module_credits ) / sum (module_credits) Work to come… (the MS-SQL part). You will create an equivalent system using MS-SQL but add in a security model, management activities & systems for exchanging data between different systems etc.