School of Information Technology & Computer Science Session: Autumn 2020
University of Wollongong
Scope
Lecturer: Janusz R. Getta
CSCI317 Database Performance Tuning Assignment 4
Published on 25 May 2020
This assignment includes the tasks in improving performance of database applications through transformations of SQL statements, replacing SQL statements with more efficient SQL statements, transformation of database applications written in PL/SQL and JDBC, tuning data buffer caches and using IN-Memory Column Store.
This assignment is due by Saturday, 13 June, 2020, 11.00 pm (sharp).
This assignment is worth 20% of the total evaluation in the subject.
A submission procedure is explained at the end of assignment specification.
This assignment consists of 6 tasks and specification of each task starts from a new page.
It is recommended to solve the problems before attending a laboratory class in order to efficiently use supervised laboratory time.
A submission marked by Moodle as “late” is treated as a late submission no matter how many seconds it is late.
A policy regarding late submissions is included in the subject outline.
A submission of compressed files (zipped, gzipped, rared, tared, 7-zipped, lhzed, … etc) is not allowed. The compressed files will not be evaluated.
All files left on Moodle in a state “Draft(not submitted)” will not be evaluated.
It is expected that all tasks included within Assignment 4 will be solved individually without any cooperation with the other students. If you have any doubts, questions, etc. please consult your lecturer or tutor during lab classes or office hours. Plagiarism will result in a FAIL grade being recorded for the assessment task.
Please read very carefully information included in Prologue section below about the software environment to be used in the subject.
Prologue
In this subject we use Oracle 19c database server running under Oracle Linux 7.4 operating system on a virtual machine hosted by VirtualBox. To start Oracle database server you have to start VirtualBox first. To start VirtualBox navigate through the following menus: Start->All Programs->Oracle VM VirtualBox->Oracle VM VirtualBox.
It is explained in Cookbook for CSIT115 Recipe 1.1, Step 1 “How to start VirtualBox ?” (https://www.uow.edu.au/~jrg/115/COOKBOOK/e1-1-frame.html) how to start VirtualBox.
When VirtualBox is started, import an appliance included in a file OracleLinux7.4- 64bits-Oracle19c-22-JAN-2020.ova and located on a drive VMs(E:) in a folder Virtual Machines\CSCI317-Janusz.
When ready, power on a virtual machine OracleLinux7.4-64bits-Oracle19c- 22-JAN-2020.
A password to a Linux user ORACLE is oracle and a password to Oracle users SYSTEM and SYS (database administrators) is also oracle. Generally, whenever you are asked about a password then it is always oracle, unless you change it.
When logged as a Linux user, you can access Oracle database server either through a command line interface (CLI) SQLcl or through Graphical User Interface (GUI) SQL Developer.
You can find in Cookbook, Recipe 1, How to access Oracle 19c database server, how to use SQL Developer, how to use basic SQL and SQLcl, and how to create a sample database ? more information on how to use SQLcl and SQL Developer.
It is strongly recommended to drop the relational tables of TPC-HR benchmark database and to drop all tablespaces created in the previous Assignment. Then, recreate TPC-HR benchmark database in a way explained in Assignment 1, task 1. The other option is to import a new copy of virtual machine with Oracle 19c and again repeat Assignment 1, task 1.
Tasks
Task 1 (5 marks)
An objective of this task is to improve performance of UPDATE statement.
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
Perform the following steps.
(1) ReadandanalyseSQLstatementsincludedascriptfiletask1.sql.
(2) Use SQL scripts utlestat.sql and utlestat.sql to find the performance statistics from processing of a script task1.sql. We are only interested in I/O operations over tablespaces I/O load distribution among disk drives and data files. Save in a file report1.txt the performance statistics we are interested in.
(3) Next, improve the performance of a script task1.sql through implementation of another sequence of SQL statements that implement the same functionality as SQL statements in the original script task1.sql. The main objective of the optimization is to minimize the total number of I/O operations on a tablespace tpchr. Save your solution in a script solution1.sql.
(4) Use SQL scripts utlestat.sql and utlestat.sql to find the performance statistics from processing of a script solution1.sql. We are only interested in I/O operations over tablespaces I/O load distribution among disk drives and data files.
Append the statistics we are interested in to a file report1.txt. Note that we are not interested in a complete report generated by a script utlestat.sql. Only, information about I/O operations over tablespaces I/O load distribution among disk drives and data files is required.
Save a report from processing of script file solution1.sql in a file solution1.lst.
To get a report from processing of a file solution1.sql use SQLcl and set the options ECHO and FEEDBACK set to ON such that all SQL statements processed are included in the report !
Y ou must put the following SQLcl statements
SPOOL solution1
SET ECHO ON
SET FEEDBACK ON
SET LINESIZE 300
SET PAGESIZE 300
at the beginning of each SQL script implemented and the following statement at the end of the script
SPOOL OFF
A report from processing of the script must have NO syntax errors !
Deliverables
A file solution1.lst that contains a report from the processing of a script solution1.sql, and file report1.txt that contains the required performance statistics from processing of the scripts task1.sql and solution1.sql.
Please remember that inclusion of the entire reports from processing of utlestat.sql scores no marks.
Task 2 (5 marks)
Performance tuning of stored PL/SQL
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
One of the very weak points of cost-based query optimizers is translation of queries that include the calls to stored PL/SQL functions. Consider the following PL/SQL stored function TSIZE (the first part of a file task2.sql).
CREATE OR REPLACE FUNCTION TSIZE RETURN NUMBER AS
TS NUMBER;
BEGIN
SELECT AVG(COUNT(*)) INTO TS
FROM LINEITEM
GROUP BY L_ORDERKEY;
RETURN(TS); END;
/
Assume that a stored PL/SQL function TSIZE has been used in SELECT statement in
the following way (the second part of a file task2.sql).
SELECT COUNT(*)
FROM ( SELECT L_ORDERKEY, COUNT(*)
FROM LINEITEM
GROUP BY L_ORDERKEY
HAVING COUNT(*) > (SELECT DISTINCT TSIZE
FROM LINEITEM) );
Process a script file task2.sql to create a stored function and to process SELECT
statement that uses the function. The processing takes forever. I stopped it after 15 minutes. Your task is to improve implementation of a stored function and a query included in a script
task2.sql such that processing takes as short time as possible.
Implement SQL script file solution2.sql that performs the following actions.
(1) First,thescriptcreatesanewandmoreeffectivestoredfunctionTSIZE.
(2) Next,thescriptprocessesanewandimprovedSELECTstatementthatretrievesfrom TPC-HR database the same information as the original SELECT statement.
Processing of a script file solution2.sql must create a report file solution2.lst.
When ready use the scripts utlbstat.sql and utlestat.sql to measure in I/O operations over tablespaces I/O load distribution among disk drives and data files generated by processing of a script solution2.sql. Save the important fragments of a report generated by utlestat.sql in a file report2.txt.
Append to a file report2.txt your explanations why processing of the original implementation takes so long.
To get a report from processing of a file solution2.sql use SQLcl and set the options ECHO and FEEDBACK set to ON such that all SQL statements processed are included in the report !
Y ou must put the following SQLcl statements
SPOOL solution2
SET ECHO ON
SET FEEDBACK ON
SET LINESIZE 300
SET PAGESIZE 300
at the beginning of each SQL script implemented and the following statement at the end of the script
SPOOL OFF
A report from processing of the script must have NO syntax errors !
The script must be processed with SQLcl options ECHO and FEEDBACK set to ON such that all SQL statements processed are included in the report !
Deliverables
A file solution2.lst that contains a report from the processing of a script solution2.sql, and file report2.txt that contains the required performance statistics from processing of the scripts solution2.sql and your explanations why processing of task2.sql takes so long.
Please remember that inclusion of an entire report from processing of utlestat.sql scores no marks.
Task 3 (5 marks)
An objective of this task is to improve performance of JDBC application.
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
Consider implementation of JDBC application in a file task3.java.
Your task is to improve performance of the application.
Implement JDBC application with the same functionality as an application included in a file task3.java and save it in a file solution3.java.
Use Linux command time to measure time spend on processing of the application before and after the improvements in the following way.
time java task3
time java solution3
Explain in the comments attached at the end of a file solution3.java why the original application was slower than the improved one and include the results from testing with time command.
Deliverables
A file solution3.java with the improved application, with the explanations why the original application was slower than the improved one and with the results from testing with time command.
Task 4 (5 marks)
An objective of this task is to improve performance of JDBC application.
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
Consider implementation of JDBC application in a file task4.java.
Your task is to improve performance of the application.
Implement JDBC application with the same functionality as an application included in a file task4.java and save it in a file solution4.java.
Use Linux command time to measure time spend on processing of the application before and after the improvements in the following way.
time java task4
time java solution4
Explain in the comments attached at the end of a file solution4.java why the original application was slower than the improved one and include the results from testing with time command.
Deliverables
A file solution4.java with the improved application, with the explanations why the original application was slower than the improved one and with the results from testing with time command.
Task 5 (5 marks)
Tunning data buffer cache
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
Consider SQL script task5.sql that performs many frequent accesses to the relational tables of TPC-HR database. Use AWR to find the total number of physical and logical read block operations, for example see Cookbook, Recipe 8.2 How to use Automatic Workload Repository (AWR) ?, Step 5 How to generate and how to read AWR reports ?
(1) Connect as a user SYSTEM and flush data buffer cache with a statement: ALTER SYSTEM FLUSH BUFFER_CACHE.
(2) Use AWR to find the total number of physical operations performed during the processing of a script task5.sql.
(3) Increase the size of System Global Area (SGA) by 400Mb. It is explained in Cookbook, Recipe 6.2 How to find and how to change the values of system initialization parameters ? step 4 How to change a system initialization parameter with ALTER SYSTEM statement and SCOPE option ? how to change the size of SGA.
(4) TheobjectiveofthistaskistousetheincreasedcapacityofSGAtoincreasethesizes of DEFAULT, KEEP, and RECYCLE data buffer caches and to assign the relational tables used in SQL statements of task5.sql to DEFAULT, KEEP, and RECYCLE data buffer caches in a way that minimizes the total number of physical operations performed by the script task5.sql.
Assume that you can invest only additional 400Mbytes into all three data buffer
caches. Use ALTER SYSTEM statement to set the values of system initialisation
parameters db_cache_size, db_keep_cache_size, db_recycle_cache_size.
It is explained in Cookbook, Recipe 6.2 How to find and how to change the values of system initialization parameters ? step 4 How to change a system initialization parameter with ALTER SYSTEM statement and SCOPE option ? how to change the sizes of data buffer caches.
Assign the relational tables of a sample database to the appropriate data buffer caches. It is possible to it with ALTER TABLE statement in the following way.
ALTER TABLE ORDERS STORAGE (BUFFER_POOL KEEP);
ALTER TABLE LINEITEM STORAGE (BUFFER_POOL DEFAULT);
Connect as SYSTEM and flush data buffer cache with a statement: ALTER SYSTEM FLUSH BUFFER_CACHE.
(5) Next, use AWR to find the total number of physical operations performed during the execution of the script task5.sql after the reallocation of transient storage and assignments of relational tables to data buffer caches. Save in a file solution5.lst a report from processing SQL statements processed to increase the size of SGA, to increase the sizes of data buffer caches, to allocate the relational tables to data buffer caches.
Append at the end of a file solution5.lst the total number of physical and logical read block operations reported by AWR before and after re-allocation of transient storage and assignments of relational tables to data buffer caches.
Deliverables
A file solution5.lst with the results from AWR before reallocation of transient storage, SQL statements processed to allocate the relational tables to data buffer caches and the results from AWR after the allocations.
Please remember that inclusion of an entire report obtained from AWR scores no marks.
Task 6 (5 marks)
Using In-Memory Column Store
In this task you must operate on the original state of a sample benchmark TPC-HR database. It is explained at the end of Prologue section how to return to the original state of the database.
An objective of this task is to speed up processing of SELECT statements included in a script task6.sql by assigning the relational tables to In-Memory Column Store.
(1) Make sure that the size of In-Memory area is 400Mb. If it is necessary, increase the size of SGA by 400Mb and allocate it to In-Memory Column Store. It is explained the lecture slides how to do it.
(2) Implement SQL script solution6.sql that finds the query processing plans for SELECT statements included in a script task6.sql.
(3) PopulateIn-MemoryColumnStorewithrelationaltablesand/orcolumnsofrelational tables used in a script task6.sql to improve performance in the best way. Include SQL statements that Populate In-Memory Column Store with relational tables into script file solution6.sql.
(4) Insert into a script solution6.sql the statements that find the query processing plans for SELECT statements included in a script task6.sql after Populate In- Memory Column Store with relational tables.
(5) Whenreadyprocessascriptsolution6.sqltocreateareportsolution6.lst.
To get a report from processing of a file solution6.sql use SQLcl and set the options ECHO and FEEDBACK set to ON such that all SQL statements processed are included in the report !
Y ou must put the following SQLcl statements
SPOOL solution6
SET ECHO ON
SET FEEDBACK ON
SET LINESIZE 300
SET PAGESIZE 300
at the beginning of each SQL script implemented and the following statement at the end of the script
SPOOL OFF
A report from processing of the script must have NO syntax errors !
Deliverables
A file solution6.lst that contains a report from the processing of a script solution6.sql.
Submission
Note, that you have only one submission. So, make it absolutely sure that you submit the correct files with the correct contents. No other submission is possible!
Submit the files solution1.lst, report1.txt, solution2.lst, report2.txt, solution3.lst, solution4.lst, solution5.lst, and solution6.lst through Moodle in the following way:
(1) Access Moodle at http://moodle.uowplatform.edu.au/
(2) TologinuseaLoginlinklocatedintherightuppercornertheWebpageorinthe
middle of the bottom of the Web page
(3) When logged select a site CSCI317 (S120) Database Performance
Tuning
(4) ScrolldowntoasectionSubmissions
(5) ClickatalinkInthisplaceyoucansubmittheoutcomesof
Assignment 4
(6) Click at a button Add Submission
(7) Moveafilesolution1.lstintoanareaYoucandraganddropfiles
here to add them. Y ou can also use a link Add…
(8) Repeatstep(7)forthefilesreport1.txt,solution2.lst,report2.txt,
solution3.lst, solution4.lst, solution5.lst, and
solution6.lst.
(9) Click at a button Save changes
(10) Click at a button Submit assignment
(11)Click at the checkbox with a text attached:By checking this box, I
confirm that this submission is my own work, … in order to
confirm the authorship of your submission. (12) Click at a button Continue
A policy regarding late submissions is included in the subject outline.
End of specification