程序代写代做代考 database chain SQL Microsoft Word – Setup.docx

Microsoft Word – Setup.docx

Setup

The code for this assignment has been pushed to the same repository that you forked
for the previous assignment. That means all you need to do is sync your repository with
the main repository. There are multiple ways to do this, perhaps the easiest way is to
login to the BitBucket website, go to the page for your repository, and click the “Sync
Now” link on the right side of the screen. You can also use eclipse or the command line
to perform a pull from upstream, if you prefer.

Please note that this assignment builds off of the previous assignment. In particular,
you will need a working Catalog class for this assignment to work. The other classes
are used a bit, but are not as important. If you did not complete the previous
assignment, or are concerned that your implementation is significantly flawed, please
speak with me as soon as possible so that we can figure out a solution.

Overview

In this assignment, you will implement relational operations as discussed in class. Your
first task will be to implement the operations themselves. You will then use a parser to
help translate SQL queries into these relational operations, effectively allowing users to
query the data stored on your server.

Relations

Your first task is to complete the Relation and Aggregator classes. Relation is a class
that performs relational algebra operations on the data in your database. Examine the
methods and come up with a design before beginning your implementation.

Some implementation hints for Relation:

* Notice that each relational algebra operation returns a Relation. This will allow us to
chain together many operations to form more complex queries.

* Select operations will all take the form: column (operation) constant. An example
WHERE clause might be WHERE a1 = 530.

* Something to consider for each operation is how the operation will affect the number
and type of columns in the resulting relation (if this is affected at all). Be sure to create
a new TupleDesc if necessary and use this when creating new Tuplesfor the result.

* The join operation should be an inner join, there is no need to worry about outer
joins for this assignment. You may also assume that all join conditions will use equals
(i.e. not greater than or less than).

* In class we discussed how relations consist of a set of tuples, however you’ll notice
that for this assignment we are using an ArrayList instead of a set. This means that
duplicates may exist in the result set, but that should be considered acceptable
behavior for the purpose of this assignment.

* Be aware that some relational operations may result in a relation that contains no
tuples (the empty set). You should be able to accommodate these situations, and the
resulting Relation should still have the proper TupleDesc even if it does not contain any
tuples.

Some implementation hints for Aggregator:

* Aggregates can get complicated, so for the purposes of the assignment, we are going

to restrict what kinds of aggregate queries are possible. There are two types of
aggregates to consider: aggregates with GROUP BY and aggregates without GROUP BY.
For aggregates without GROUP BY, you can assume that the relation being aggregated
has a single column (the column that is being aggregated). For aggregates that include
GROUP BY you can assume that the relation will have exactly two columns: the first
column will be the column containing the groups, and the second column will contain
the data to be aggregated.

* Note that not all aggregation operations apply to all data types.

* The Aggregator class has two primary methods. Merge accepts one tuple at a time
and updates the aggregate accordingly, and getResult returns the result of the
aggregation. This design should greatly simplify the aggregate method in
the Relation class.

* You do not need to change the column name of any columns involved in an
aggregate. Once you have finished these two classes, your code should pass
the Relation tests.

Queries

Next, we will shift our focus to queries. Parsing queries is difficult, so you will not be
tasked with doing that. Instead, we will use an open source parser called JSQLParser.

You can find the API for this parser here.

If you examine the Query class, you’ll notice that query execution has already been
started for you. The code given creates a parse tree that breaks a SQL query down into
its component parts. This is not the same thing as a query tree, in fact the code that

you will be writing will have a similar effect as a query tree. Your task is to get the
necessary information out of the parse tree and use it in combination with
your Relation class to execute the query.

Processing a parse tree often utilizes a design pattern called the visitor pattern. While
you are not tasked with creating any visitors yourself, you will notice that a few visitor
classes have been provided to you that you will need to use. As an example,
consider ColumnVisitor. Once you have used the API to extract the column information
from the parse tree, create a new ColumnVisitor for each column node, then
call accept() on each node, passing the visitor in as a parameter. You can then ask
the ColumnVisitor to tell you the name of the column contained in the node.

For example, if you have a variable called node that contains column information, you

might do something like this:

ColumnVisitor cv = new ColumnVisitor();

node.accept(cv);

System.out.println(cv.getColumn()); //should print the column name retrieved

from the node

Some implementation hints for Query:

* The API is your friend. Take a look at what classes are being referred to in the visitor
classes to get an idea of what classes to look into in the API.

* You are responsible for the following SQL keywords: SELECT, FROM, WHERE, GROUP
BY, aggregate functions (MIN, MAX, AVG, COUNT, SUM), AS (for columns only, not

tables).

* You will need to use the Catalog to access tables.

* The only method that you should need to modify for this part is the execute method
in the Query class. You should definitely examine the three visitor classes, but there is
no need to modify them.

* You should never have to instantiate the AggregateExpressionVisitor class. You’ll
notice that it is used within the ColumnVisitor class (which you will have to use, along
with the WhereExpressionVisitor class).

* Pay attention to order of operations! They definitely matter.

* The queries you are provided will follow the same rules for aggregates laid out in the
previous section.

* Casting is acceptable, but should be used sparingly, as it is a bit dangerous. Only cast
if you are absolutely certain about the type you are casting to!

* You do need to handle the all columns (*) operator in the SELECT clause.

* You should be able to handle queries that join more than two tables together.

Testing

You have been provided some tests that you can use to check basic functionality of
each of the objects. These tests are NOT comprehensive. Just because you pass the
tests does not mean that you will get a perfect score on this assignment. Please feel

free to write some of your own tests to check your implementation. Your tests will not
be part of your grade. When we do grade your assignment, we will replace the existing
tests with some more thorough tests and record the output.

Rubric

The below rubric is intended as a guideline and will be followed as closely as possible,
however submissions that deviate too much from the specification may be graded using
an alternate rubric.

Relation (55 points): select (10 points), project (10 points), rename (5 points), join (15
points), aggregate (5 points), toString (5 points), constructor and getters and setters (5
points)

Aggregate (15 points): merge (10 points), everything else (5 points)

Query (30 points): SELECT (5 points), FROM (5 points), WHERE (5 points), JOIN (5
points), Aggregates (5 points), AS (5 points)

I reserve the right to deduct points for code that is poorly formatted or poorly
documented.

Total: 100 points

Submission

To submit your homework, simply commit and push your code to your repository by the
due date. Be sure to put your names at the top of each file. Any submissions that are
committed and pushed after the due date will be discarded.

Acknowledgements

This lab was mostly created by me, however since it is based on hw1 which was taken
from MIT OpenCourseWare, it is still appropriate to provide attribution to them:

Sam Madden, Robert Morris, Michael Stonebraker, Carlo Curino, 6.830 / 6.814 Database
Systems, Fall 2010. (MIT OpenCourseWare: Massachusetts Institute of Technology),
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-830-
database-systems-fall-2010/index.htm (Accessed 9/11/17). License: Creative commons
BY-NC-SA