SCHOOL OF DESIGN, COMMUNICATION AND IT
INFT6201 – BIG DATA TUTORIAL PROJECT 1
This tutorial project is based on a dataset similar to that used in a study by Katok & Kwasnica (2007). The authors conducted a laboratory experiment and compared the revenues of different auction mechanisms, namely static first-price sealed-bid (FPSB) auctions and dynamic Dutch auctions with different clock speeds. The paper can be downloaded here:
http://link.springer.com.ezproxy.newcastle.edu.au/article/10.1007/s10683-007-9169-x
To complete this tutorial project, it is not essential to understand every detail of the study. Rather, the study serves as an example of statistical data analysis based on behavioural observations from an electronic market environment. The dataset is provided on Blackboard (“Auction.csv”).
EXERCISE 1 (1 MARK) [R-CODE]
Use R to load the dataset. Determine the number of lines (rows) and columns in the dataset. Save these values into two separate variables called “numberoflines” and “numberofcolumns”. Display these variables on the screen using the cat-command.
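As a minimal sketch of the functions involved (not a model solution), the following R snippet writes a small made-up table to a temporary CSV file and reads it back. The toy values and the column name `Setting` are assumptions for illustration only; with the real dataset, the path would point to “Auction.csv”.

```r
# Toy data standing in for the real file. The values and the column
# name "Setting" are invented for illustration.
toy <- data.frame(
  Setting = c("Sealed", "Sealed", "Dutch30"),
  HValue1 = c(50, 70, 60)
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# read.csv() loads a comma-separated file into a data frame.
auction <- read.csv(path)

# nrow() and ncol() return the dimensions of a data frame.
numberoflines   <- nrow(auction)
numberofcolumns <- ncol(auction)

# cat() prints its arguments separated by spaces; "\n" ends the line.
cat("Number of lines:", numberoflines, "\n")
cat("Number of columns:", numberofcolumns, "\n")
```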
EXERCISE 2 (2 MARKS) [R-CODE]
Use R to compute the mean value (variable: HValue1) for the setting “Sealed”. Then, use R to compute the mean value (variable: HValue1) for the setting “Dutch30”. Display the difference in means across the two settings on the screen using the cat-command.
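The subsetting pattern behind this exercise can be sketched on toy data. The column name `Setting` and all values below are hypothetical; the actual name of the setting column in “Auction.csv” may differ.

```r
# Toy data; the column name "Setting" and the values are assumptions.
toy <- data.frame(
  Setting = c("Sealed", "Sealed", "Dutch30", "Dutch30"),
  HValue1 = c(40, 60, 70, 90)
)

# A logical condition inside [ ] keeps only the matching observations.
meanSealed  <- mean(toy$HValue1[toy$Setting == "Sealed"])
meanDutch30 <- mean(toy$HValue1[toy$Setting == "Dutch30"])

cat("Difference in means (Sealed - Dutch30):", meanSealed - meanDutch30, "\n")
```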
EXERCISE 3 (1 MARK) [R-CODE]
Use R to determine the standard deviation and the variance of the profit of the winning bidder (variable: HProfit) in the setting “Sealed”. Display these values on the screen using the cat-command. Note: Please only take the observations of the winning bidders into account.
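A rough illustration of combining two conditions and applying sd() and var() follows. The indicator column `Winner` (1 = winning bidder) is purely hypothetical, as are `Setting` and the values; how winners are identified in the real dataset has to be checked in the file itself.

```r
# Toy data; "Setting", "Winner" and all values are assumptions.
toy <- data.frame(
  Setting = c("Sealed", "Sealed", "Sealed", "Sealed", "Dutch30"),
  Winner  = c(1, 0, 1, 1, 1),
  HProfit = c(2, 0, 4, 6, 5)
)

# & combines both conditions: setting "Sealed" AND winning bidder.
profitSealedWinners <- toy$HProfit[toy$Setting == "Sealed" & toy$Winner == 1]

# sd() and var() use the sample (n - 1) denominator.
sdProfit  <- sd(profitSealedWinners)
varProfit <- var(profitSealedWinners)

cat("Standard deviation:", sdProfit, "\n")
cat("Variance:", varProfit, "\n")
```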
EXERCISE 4 (1 MARK)
In your own words, describe the difference between a standard error (of the mean) and a standard deviation.
EXERCISE 5 (2 MARKS) [R-CODE]
Use R to determine the median price (HPrice) for each of the four settings (“Sealed”, “Dutch30”, “Dutch10”, “Dutch1”) and combine the four medians in a vector using the c()-command. Note: Please only take the observations of the winning bidders into account. Save the vector into a variable called “medianPrices”.
Then, use the cat-command to display the minimum median price (i.e., the lowest of the 4 median prices) and the maximum median price (i.e., the highest of the 4 median prices) on the screen. The minimum and maximum values should be determined based on the newly created variable “medianPrices”.
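The following sketch shows the combination of median(), c(), min() and max() on toy data. It assumes that the toy table already contains only winning bidders and that the setting is stored in a column called `Setting`; both are assumptions, and the prices are invented.

```r
# Toy data: three (hypothetical) winning-bidder prices per setting.
toy <- data.frame(
  Setting = rep(c("Sealed", "Dutch30", "Dutch10", "Dutch1"), each = 3),
  HPrice  = c(50, 60, 70,   # Sealed
              40, 45, 80,   # Dutch30
              55, 65, 75,   # Dutch10
              70, 72, 90)   # Dutch1
)

# One median per setting, combined into a single vector with c().
medianPrices <- c(
  median(toy$HPrice[toy$Setting == "Sealed"]),
  median(toy$HPrice[toy$Setting == "Dutch30"]),
  median(toy$HPrice[toy$Setting == "Dutch10"]),
  median(toy$HPrice[toy$Setting == "Dutch1"])
)

# min() and max() operate on the whole vector.
cat("Minimum median price:", min(medianPrices), "\n")
cat("Maximum median price:", max(medianPrices), "\n")
```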
EXERCISE 6 (1 MARK) [R-CODE]
Use if() to compare the median prices in the “Dutch10” and the “Dutch30” setting and display a textual statement on the screen that describes which of the two median prices is higher.
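The control flow can be sketched as follows. The two numbers are placeholders, not values from the dataset; in an actual solution they would come from the computed medians.

```r
# Placeholder medians (invented values, not taken from "Auction.csv").
medianDutch10 <- 65
medianDutch30 <- 45

# if / else if / else covers all three cases, including a tie.
if (medianDutch10 > medianDutch30) {
  msg <- "The median price is higher in the Dutch10 setting."
} else if (medianDutch10 < medianDutch30) {
  msg <- "The median price is higher in the Dutch30 setting."
} else {
  msg <- "The median prices in both settings are equal."
}

cat(msg, "\n")
```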
EXERCISE 7 (2 MARKS) [R-CODE]
Write a function that determines the 95% CI of the mean for a given vector x. Call this function “calc95CI”. The function should return a vector of two values: (i) the lower bound of the 95% CI and (ii) the upper bound of the 95% CI. Use the c()-command to combine the two values into a vector.
Use this function to display the 95% CI of the mean for the prices in the “Sealed” setting. Note: Please only take the observations of the winning bidders into account.
CI = confidence interval
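For orientation, one common form of the 95% CI of the mean is given below. It uses the normal approximation with the critical value 1.96; the sheet does not state which variant is expected, so this choice is an assumption, and for small samples the corresponding quantile of the t-distribution with n − 1 degrees of freedom is often used instead. Here, x̄ denotes the sample mean, s the sample standard deviation and n the number of observations in x.

```latex
\[
  \text{95\% CI} \;=\;
  \left[\; \bar{x} - 1.96 \cdot \frac{s}{\sqrt{n}},\;\;
           \bar{x} + 1.96 \cdot \frac{s}{\sqrt{n}} \;\right],
  \qquad
  \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
  \quad
  s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}
\]
```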
REFERENCES
Katok, E., & Kwasnica, A. M. (2007). Time is money: The effect of clock speed on seller’s revenue in Dutch auctions. Experimental Economics, 11(4), 344–357. http://link.springer.com.ezproxy.newcastle.edu.au/article/10.1007/s10683-007-9169-x