程序代写代做代考 chain Java database data structure concurrency CS44800 Project 2

CS44800 Project 2

The RDBMS Lower-Level Layers

Spring 2017
Due: February 27, 2017, 11:59PM

(There will be a 10% penalty for each late day up to four days. The assignment will not

be accepted afterwards.)

Notes:

● This project should be carried out in groups of two. Choose a partner as soon as possible.

● This project will be based on Minibase, a small relational DBMS, structured into several layers. In

this project, you are required to implement parts of the lower-level layers: BM – the buffer

manager layer and HFPage – the heap file layer. Layers should be tested and turned in

separately.

PART 1: Buffer Management
1. Introduction
You are required to implement a simplified version of the Buffer Manager layer without support

for concurrency control or recovery. You will be given the code for the lower layer: the Disk

Space Manager.

You should begin by reading the chapter on Disks and Files to get an overview of buffer

management. In addition, HTML documentation is available for Minibase. There is a link to the

Minibase home page here. In particular, you should read the description of the DB class, which

you will call extensively in this assignment. The Java documentation for the diskmgr package

can be found here. You should also read the code under diskmgr/ carefully to learn how the

package is declared and how exceptions are handled in Minibase.

2. The Buffer Manager Interface
The simplified Buffer Manager interface that you will implement in this assignment allows a

client (a higher level program that calls the Buffer Manager) to allocate/de-allocate pages on

disk, to bring a disk page into the buffer pool and pin it, and to unpin a page in the buffer pool.

The methods that you have to implement are described below:

http://research.cs.wisc.edu/coral/mini_doc/intro/single_user.html
https://www.cs.purdue.edu/homes/pearso29/MinibaseJavadoc/index.html

/**

* Create the BufMgr object.

* Allocate pages (frames) for the buffer pool in main memory and

* make the buffer manage aware that the replacement policy is

* specified by replacerArg (e.g., LH, Clock, LRU, MRU, LFU, etc.).

* @param numbufs number of buffers in the buffer pool

* @param lookAheadSize: Please ignore this parameter

* @param replacementPolicy Name of the replacement policy, that parameter will be set to “MRU” (you

can safely ignore this parameter as you will implement only one policy)

public BufMgr(int numbufs, int lookAheadSize, String replacementPolicy) {};

/**

* Pin a page.

* First check if this page is already in the buffer pool.

* If it is, increment the pin_count and return a pointer to this

* page.

* If the pin_count was 0 before the call, the page was a

* replacement candidate, but is no longer a candidate.

* If the page is not in the pool, choose a frame (from the

* set of replacement candidates) to hold this page, read the

* page (using the appropriate method from {\em diskmgr} package) and pin it.

* Also, must write out the old page in chosen frame if it is dirty

* before reading new page.__ (You can assume that emptyPage==false for

* this assignment.)

* @param pageno page number in the Minibase.

* @param page the pointer point to the page.

* @param emptyPage true (empty page); false (non-empty page)

public void pinPage(PageId pageno, Page page, boolean emptyPage) {};

/**

* Unpin a page specified by a pageId.

* This method should be called with dirty==true if the client has

* modified the page.

* If so, this call should set the dirty bit

* for this frame.

* Further, if pin_count>0, this method should

* decrement it.

*If pin_count=0 before this call, throw an exception

* to report error.

*(For testing purposes, we ask you to throw

* an exception named PageUnpinnedException in case of error.)

* @param pageno page number in the Minibase.

* @param dirty the dirty bit of the frame

public void unpinPage(PageId pageno, boolean dirty) {};

/**

* Allocate new pages.

* Call DB object to allocate a run of new pages and

* find a frame in the buffer pool for the first page

* and pin it. (This call allows a client of the Buffer Manager

* to allocate pages on disk.) If buffer is full, i.e., you

* can’t find a frame for the first page, ask DB to deallocate

* all these pages, and return null.

* @param firstpage the address of the first page.

* @param howmany total number of allocated new pages.

* @return the first page id of the new pages.__ null, if error.

public PageId newPage(Page firstpage, int howmany) {};

/**

* This method should be called to delete a page that is on disk.

* This routine must call the method in diskmgr package to

* deallocate the page.

* @param globalPageId the page number in the data base.

public void freePage(PageId globalPageId) {};

/**

* Used to flush a particular page of the buffer pool to disk.

* This method calls the write_page method of the diskmgr package.

* @param pageid the page number in the database.

public void flushPage(PageId pageid) {};

/**

* Used to flush all dirty pages in the buffer pool to disk

public void flushAllPages() {};

/**

* Returns the total number of buffer frames.

public int getNumBuffers() {}

/**

* Returns the total number of unpinned buffer frames.

public int getNumUnpinned() {}

3. Internal Design
The buffer pool is a collection of frames (page-sized sequence of main memory bytes) that is

managed by the Buffer Manager. Page objects are implemented as an array of bytes and deal

with the buffer pool at the byte array level. Note that the size of the Minibase pages is defined in

the interface GlobalConst of the global package. Before jumping into coding, please make sure

that you understand how the Page object is defined and manipulated in Java Minibase.

In addition, you should maintain an array bufDescr[numbuf] of descriptors, one per frame. Each

descriptor is a record with the following fields:

page number, pin_count, dirtybit.

The pin_count field is an integer, page number is a PageId object, and dirtybit is a boolean. This

describes the page that is stored in the corresponding frame. You are free to extend this

descriptor to hold additional information if necessary for your implementation. A page is

identified by a page number that is generated by the DB class when the page is allocated, and

is unique over all pages in the database. The PageId type is defined as an integer type in global

package.

A simple hash table should be used to figure out which frame a given disk page occupies. You

should implement your own hash table class and not use existing HashTable Java classes. The

hash table should be implemented (entirely in main memory) by using an array of pointers to

lists of pairs. The array is called the directory and each list of

pairs is called a bucket. Given a page number, you should apply a hash function to find the

directory entry pointing to the bucket that contains the frame number for this page, if the page is

in the buffer pool. If you search the bucket and do not find a pair containing this page number,

the page is not in the pool. If you find such a pair, it will tell you the frame in which the page

resides.

The hash function must distribute values in the domain of the search field uniformly over the

collection of buckets. If we have HTSIZE buckets, numbered 0 through M-1, a hash function h of

the form h(value) = (a*value+b) mod HTSIZE works well in practice. HTSIZE should be chosen

to be a prime number.

When a page is requested, the buffer manager should do the following:

1. Check the buffer pool (by using the hash table) to see if it contains the requested page. If the

page is not in the pool, it should be brought in as follows:

(a) Choose an unused frame. If all frames are occupied (i.e., the buffer pool is full),

choose a frame for replacement using the MRU policy as described below.

(b) If the frame chosen for replacement is dirty, flush it (i.e., write out the page that it

contains to disk, using the appropriate DB class method).

replacement; the pin_count and dirtybit for the frame should be initialized to 0 and

FALSE, respectively.

(d) Delete the entry for the old page from the Buffer Manager’s hash table and insert an

entry for the new page. Also, update the entry for this frame in the bufDescr array to

reflect these changes.

2. Pin the requested page by incrementing the pin_count in the descriptor for this frame and

return a pointer to the page to the requestor.

The Most Recently Used (MRU) Replacement Policy

The MRU replacement policy keeps track of how recently buffered pages have been accessed.

When the buffer pool is full and requires more room to host an unbuffered page, MRU will select

the most recently used page as a victim (i.e., will be evicted from the buffer pool to allow

buffering a new page).

4. Error protocol
Though the Throwable class in Java contains a snapshot of the execution stack of its thread at

the time it was created and also a message string that gives more information about the error

(exception), we have decided to maintain a copy of our own stack to have more control over the

error handling.

We provide the chainexception package to handle the Minibase exception stack. Every

exception created in your bufmgr package should extend the ChainException class. The

exceptions are thrown according to the following rule:

Error caused by an exception caused by another layer:

For example: (when try to pin page from diskmgr layer)

try {

Minibase.BufferManager.pinPage(pageId, page, true);

}

catch (Exception e) {

throw new DiskMgrException(e, “DB.java: pinPage() failed”);

}

Error not caused by exceptions of others, but needs to be acknowledged:

For example: (when try to unpin a page that is not pinned)

if (pin\_count == 0) {

throw new PageUnpinnedException (null, “BUFMGR: PAGE_NOT_PINNED.”);

}

Basically, the ChainException class keeps track of all the exceptions thrown across the different

layers. Any exceptions that you decide to throw in your bufmgr package should extend the

ChainException class.

For testing purposes, we ask you to throw the following exceptions in case of error (use the

exact same name, please):

BufferPoolExceededException

Throw a BufferPoolExceededException when an attempt is made to pin a page to the buffer

pool with no unpinned frame left.

PagePinnedException

Throw a PagePinnedException when an attempt is made to free a page that is still pinned.

HashEntryNotFoundException

Throw a HashEntryNotFoundException when the page specified by PageId is not found in the

buffer pool.

Feel free to throw other new exceptions as you see fit. The provided test cases can help you

figure out where and which exceptions need to be thrown. But make sure that you follow the

error protocol when you catch an exception. Also, think carefully about what else you need to do

in the catch phase. Sometimes you do need to unroll the previous operations when failure

happens.

5. Where to Find Makefiles, Code, etc.
Please copy this file (proj21.zip) into your own Unix local directory. The directory lib contains the

jar file that implements the disk manager, you will use this file to compile your code. The content

of the directory src is:

under bufmgr directory

You should make all your code for bufmgr package implemented under this directory.

under diskmgr directory

The code for the diskmgr package has been available for you to study in order to understand

what is happening when the Buffer Manager is making calls to the Disk Manager. The Javadoc

also gives information on Disk Manager behavior. You should not have to make any changes to

code here.

under tests directory

TestDriver.java, BMTest.java: Buffer manager test driver program.

The provided test cases are given to you in order to give you a starting place. They are not

going to be the test cases used for grading. You are encouraged to add your own test cases in

order to test for corner cases.

https://www.cs.purdue.edu/homes/pearso29/cs448_S17/project2/proj21.zip

We provide a sample Makefile to compile your project. You will have to set up any

dependencies by editing this file. You may need to design your own Makefile. Whatever you do,

please make sure that the classpath is correct.

You can find other useful information by reading the java documentation on other packages,

such as the global and diskmgr package.

6. What to Turn in
You should turn in copies of your code together with the Makefile and a readme file. All files

need to be zipped in a file named: your_career_login1_your_career_login2_bufmgr.zip.

In the readme file, put the name of your group members, and the feature you would like the TAs

to know. The TAs should be able to compile/run your program using make on the university

Linux machines. You do not need to include the library file and other files provided.

The directory structure of your zip file should be identical to the directory structure of the

provided zip file (i.e., having the directory src, the Makefile, …), except the top-level name

(should be your career login above). Your grade may be reduced 5% off if you do not follow this.

In the readme file, make sure to state the roles of each member in the group (i.e., who did

what). Only one member should submit the project on behalf of the other member (please

indicate in the read me file who is the one submitting).

PART 2: Heap Files
1. Introduction
In this part, you will implement the Heap file layer. You will be given the documentation for the

lower layers (Buffer Manager and Disk Space Manager), as well as the documents for

managing records on a Heap file page. You can find the package index for the above here.

This assignment has three parts. You have to do the following:

1. Familiarize yourself with the HeapFile, HFPage, Buffer Manager and Disk Space Manager

interfaces.

2. Implement the HeapFile class. You can ignore concurrency control and recovery issues, and

concentrate on implementing the interface routines. You should deal with free space

intelligently, using either a page directory to identify pages with room for records. When a record

is deleted, you must update your information about available free space, and when a record is

inserted, you must try to use free space on existing pages before allocating a new page.

3. Run the tests provided.

The major methods of HeapFile.java that you will implement include:

https://www.cs.purdue.edu/homes/pearso29/MinibaseJavadoc/index.html

public HeapFile(String name)

public RID insertRecord(byte[] record)

public Tuple getRecord(RID rid)

public boolean updateRecord(RID rid, Tuple newRecord)

public boolean deleteRecord(RID rid)

public int getRecCnt() //get number of records in the file

public HeapScan openScan()

Note: For the purposes of this assignment, we have slightly different requirements than the full

implementation of Minibase. The high-level descriptions in the Javadoc of how methods perform

is the same. However, for any conflicts between the Javadoc and this provided interface in

terms of return types or parameters, refer to this interface. The provided test cases can also be

used to check which version is correct.

2. Available Documentation
You should begin by reading the chapter on Disks and Files, to get an overview of the HF layer

and buffer management.

3. Page Directory Requirements
In order to track free space on pages in the file and be able to efficiently implement the insertRecord,

updateRecord, and deleteRecord methods, you will need to use a Page Directory data structure. These

operations should be implemented in O(N). All other operations should take at most O(1). However, your

HeapFile will also have to efficiently handle variable-length records. Although a Linked List

implementation to track free space would technically satisfy the O(N) requirements, this performs

relatively poorly for variable-length records. A Linked List implementation will therefore not receive full

credit compared to a Page Directory implementation.

In addition, your Page Directory needs to handle the case where the Page Directory is too large to fit in

one page. You must make use of Pages to store the Page Directory on-disk. It is not sufficient to store the

Page Directory as an in-memory data structure.

4. Representing Records
Records in the database are represented using the Tuple class. Although the Minibase

documentation provided to you does have a specification for a Tuple class, this is implemented

at a higher level of the database than we need for this project.

You will implement your own, simple Tuple class for the purposes of this project. The Tuple

class essentially needs to just provide accessor methods for the data stored in the Tuple. You

should not implement any sort of validation logic in the Tuple class, just return the data.

Validation and error handling should be done in the relevant HeapFile methods.

The basic interface the Tuple class needs to satisfy is as follows:

public Tuple()

public Tuple(byte[] data, int offset, int length)

public int getLength()

public byte[] getTupleByteArray()

You can ignore the offset variable. It will not be used in this assignment.

5. Classes to Familiarize Yourself With First
There are three main packages with which you should familiarize yourself: heap, bufmgr,

diskmgr. Note that part of the heap package contains implementation for HFPage. The java

documentation of HFPage is provided to you. A HeapFile is seen as a collection of records.

Internally, records are stored on a collection of HFPage objects.

6. Compiling Your Code and Running the Tests
Copy this file (proj22.zip) to your own local directory and study them carefully. The files are:

– In directory src/heap: Again, the java documentation for package bufmgr, diskmgr and heap

are online.

Note that you DO NOT have to throw the same exceptions as documented for the heap

package. However, for testing purposes, we DO ask you to use InvalidUpdateException to

signal any illegal operations on the record. In other error situations, you should throw exceptions

as you see fit following the error protocol introduced in the Buffer Manager Assignment.

Please check that you are not redefining an exception already defined in the provided

heapAssign.jar library. If an exception you want to create is already defined, just use the

version provided in the library. All required exceptions that the test cases are expecting

in order to compile should be included in the provided library.

– In directory src/tests: This directory contains the source code of the test. The provided test

cases are given to you in order to give you a starting place. They are not going to be the test

cases used for grading. You are encouraged to add your own test cases in order to test for

corner cases.

– In directory lib:heapAssign.jar: this is a library that has the implementation of the Disk

Manager and Buffer Manager layers. As you know, you will develop the Buffer Manager layer in

the first part of this project. We are providing this library to help you start working with the

second part as soon as possible and also to help you test your code comparing the behavior of

your code with the one obtained used the library.

6. What to Turn in
Part 2 & 3 should be compiled together. See Part 3 to know what to turn in.

https://www.cs.purdue.edu/homes/pearso29/cs448_S17/project2/proj22.zip

PART 3: Heap Scans
After completing the buffer manager and heap file layers, you will be able to implement a basic

scan of a given file. A heap scan is simply an iterator that traverses the file’s directory and data

pages to return all records. The major methods of HeapScan.java that you will implement include

the following (please see the javadoc for detailed descriptions):

protected HeapScan(HeapFile hf)

protected void finalize() throws Throwable

public void close()

public boolean hasNext()

public Tuple getNext(RID rid)

Internally, each heap scan should consist of (at least) the following:

Directory Position

The current directory page and/or entry in the scan.

Record Position

The current data page and/or RID in the scan.

The Tuple class is the wrapper of an array of data. Internally it should contain a declaration

byte[] data and other methods called by the test driver.

Your implementation should have at most one directory page and/or at most one data page

pinned at any given time.

What to Turn in
Part 2 & 3 should be compiled together. You should turn in copies of your code together with the

Makefile and a readme file. All need to be zipped in a file

named:your_career_login1_your_career_login2_heapfile.zip.

In the readme file, put the name of your group members, and the feature you would like me to

know. I should be able to compile/run your program using make. You don’t need to include the

library file and other files provided. In the readme file, you should also include the roles of each

group member in the project, i.e., who implemented what.

The directory structure of your zip file should be identical to the directory structure of the

provided zip file (i.e., having the directory src, the Makefile, …), except the top-level name

(should be your Purdue login above). Your grade may be deduced 5% off if you don’t follow this.

Only one member is allowed to submit Part 2 & 3 (the same person who submitted Part 1).

You should upload all your zip files using Blackboard.

https://www.cs.purdue.edu/homes/pearso29/MinibaseJavadoc/index.html

Related Posts