A3 – Data from Heterogeneous Data Sources

Computer Science 280 – A3 1 radu/2015

Introduction

A3 – Data from Heterogeneous Data Sources

This practical assignment covers several topics taught in this course, such as: functional programming for data access, LINQ, LINQ to SQL, LINQ to FileDb, LINQ to Excel, and a few basic concepts related to the modern web REST/MVC pattern (here we use ASP.NET MVC4 with Razor syntax).

Overview

You are first required to complete a command-line skeleton, A3.CS, which offers a uniform read-only interface to several data sources, which hold similar (or not identical) data:

1) SqlCompact: the Orders, Customers, Employees tables from the SQL Compact Northwind.sdf demo database

2) FileDb: the Orders,fdb, Customers.fdb, Employees.fdb tables from the FileDb “Northwind NoSql Database” demo database folder

3) Excel: the Orders, Customers, Employees spreadsheets from the Northwind.xls demo workbook

Note: The Web demo shows a fourth option, SqlLocal (or SqlExpress), which is not required, because these SQL servers are not properly installed in the labs

The given skeleton, A3.CS, must be carefully completed with functional LINQ C# code (query expressions or fluent methods), individually by each student.

You also receive a Web MVC skeleton, A3MVC. If properly completed, the same A3.CS file will transform the MVC skeleton into a fully working Web application!

Starting materials
o A2CMD: folder with the skeleton A3.CS and all required files and libraries to

complete the command-line version
o A2MVC: folder with the skeleton MVC Web application – only the completed

A3.CS is still needed

o Online MVC demo (available in the university network only): http://130.216.41.42/MvcAssignmentPrecompiled

Computer Science 280 – A3 2 radu/2015

ER Model
Conceptually, our three table entities are related in two master/detail relationships, as

indicated by the following diagram (Orders/Customers N:1, Orders/Employees N:1):

Operations

o Inner join of the three tables: Orders⋈Customers⋈Employees

For each data source, you need to perform, in order, the following operations:

o Project the required columns: OrderID, ShipCountry, CompanyName, ContactName, EmployeeName (build from FirstName and LastName)

o Sort unambiguously, as required: either by the OrderID column or by a pair of two columns, such as (CountryName, OrderID) – a user chosen column plus OrderID (to ensure a complete order)

o Paginate, by skipping a number of rows and taking the required number for the “window”

For full marks, all these operations must be written in C# and LINQ, but executed, as much as possible, on the database server (after their automatic translation to SQL)

Computer Science 280 – A3 3 radu/2015

The final composite row is represented by this model:

public class MyJoin {
public int OrderID { get; set; }
public string ShipCountry { get; set; } public string CompanyName { get; set; } public string ContactName { get; set; } public string EmployeeName { get; set; }

}

// Orders
// Orders
// Orders⋈Customers
// Orders⋈Customers
// from Orders⋈Employee

Caveats

o Null EmployeeID may appear in the SqlCompact database

o Column names may contain spaces, e.g. [Order ID], in the SqlCompact database

o FileDb does not offer a join API and SqlCompact offers only some “tricky” pagination support – so these operations need to be performed in memory

o You must not use any third party non-standard libraries, except the provided libraries (specific to FileDb and Excel) and Dynamic LINQ layer (very useful to avoid to avoid complex programming structures)

Methods to implement

You must implement the following three major public stubs, which currently return empty row sequences:

public List<MyJoin> SqlCompact(
string sort, string sortDir, int rowsPerPage, int page, out int count, out string sortCriteria, out string sql) { … }

public List<MyJoin> NoSql(
string sort, string sortDir, int rowsPerPage, int page, out int count, out string sortCriteria) { … }

public List<MyJoin> Excel(
string sort, string sortDir, int rowsPerPage, int page, out int count, out string sortCriteria) { … }

Computer Science 280 – A3 4 radu/2015

Besides these three major methods, you can define additional helper methods and types, but only in the specifically reserved areas (“***”) without changing anything else.

The above three methods use the following input parameters, which also correspond to the “standard” query parameters in typical MVC applications:

o page: page number, starting with 1, default 1 (int)
o rowsPerPage: rows per page (page size), default 10 (int)
o sort: name of sort column, default “OrderID” (string)
o sortDir: sort direction, ”ASC” or “DESC”, default “ASC” (string)

All three methods return a list of rows, typed as List<MyJoin>. This list must contain the final result (after all joining, projecting, sorting, and paginating).

The output parameters are non-standard, but quite useful:
o count: the numbers of rows in the three-way join, before pagination – useful to

determine the total number of pages required

o sortCriteria: a string composed from sort and sortDir – in the format directly usable by Dynamic LINQ, e.g.

OrderID ASC

ContactName DESC, OrderID DESC

o sql: the automatically generated SQL code – useful to appreciate the performance quality, e.g. the amount of work actually performed on the database side and the number of round trips to it (chatty vs chunky)

Outputs

Your code will be judged by how well it performs its task, i.e. returning the required row lists and output parameters. Your own code must not write anything, except tracing lines on standard error. In the model solution, these “error” lines start with three dots (“… “), but are not considered in the evaluation.

The already given code serializes the returned rows as strings are writes them to a file called _out.log. Please do not interfere with this convention, as this file is the primary result assessed by the markers. Also, the given code prints the returned sql strings to standard error. The provide batch files collects all standard error lines to a file called _error.log. This might be useful for your self-assessment.

Computer Science 280 – A3 5 radu/2015

Main marking criteria

The markers will check if the main output file, _out.log, is identical to the expected model output, on a variety of test cases – these identity checks include everything: Unicode chars, spaces, letter case…

The markers will also check how your completed A3.CS file will work in the given MVC skeleton.

Additionally, the markers will check the quality of your code, especially:

o the proper usage of high-level functional C# LINQ code (e.g. as opposed to raw SQL code) and

o the quality of automatically generated SQL code (which may affect the performance).

For the last item, the markers will mainly rely on an SQL Profiler (which can display all SQL code actually executed)

Submission
Submit electronically, to the COMPSCI web dropbox, a 7z archive containing your

completed skeleton, called A3.CS, and a corresponding compilation batch, _A3.BAT. Please do not include any other files – even if you develop your submission with VS!

Please keep all file, methods and parameters naming conventions – this will ensure that your submission will be properly marked.

Please recheck it and ensure that it compiles and works in the labs (not only on your home machine)!

Please keep your receipt, and, just in case, keep also a copy of the submitted archive!

Deadline

Friday 12 October, 2015, 12:00 noon. Do not leave it for the last minute, please. Remember that you can resubmit and, by default, we only consider your last submission.

This assignment carries 4% towards the final mark for the paper. Late submissions will incur penalties, 0.5% off for each hour late, for up to three days.