QUESTION/ANSWER BOOKLET COMPSCI 367
THE UNIVERSITY OF AUCKLAND
SEMESTER Two 2016 Campus: City
COMPUTER SCIENCE Artificial Intelligence (Time allowed: TWO hours)
This examination is out of 100 marks.
Attempt ALL questions.
Write your answers in the space provided in this booklet. There is space at the back for answers that overflow the allotted space.
The use of calculators is NOT permitted.
This is a closed book exam.
Surname: Forenames: Student ID: Login (UPI):
Page 1 of 17
QUESTION/ANSWER BOOKLET
Question 1
COMPSCI 367 Student ID: _______________
A major constraint on developing knowledge-based systems is the knowledge engineering bottleneck, caused by the difficulty of eliciting knowledge from domain experts. Describe three potential solutions to this bottleneck that you have learned from COMPSCI.367.
[10 marks]
Page 2 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 2
a) Describe how you would conduct a Turing Test.
b) Explain the relationship between the Turing Test and Searle’s thought experiment the Chinese Room.
Page 3 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 3
Why would people develop ontologies for domains such as medicine, law or commerce? Your answer should discuss some of the advantages of using ontologies in complex domains.
[10 marks]
Page 4 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 4
a) Why is blind or brute-force search not feasible for solving the majority of real- world problems? Give an example where blind search will not work.
b) How does heuristic search improve on blind search? Give an example of where heuristic search is used.
Page 5 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 5
A multiagent system is one that consists of a number of agents, which interact with one-another. In the most general case, agents will be acting on behalf of users with different goals and motivations. Describe the three abilities agents will need to successfully interact with each other.
[10 marks]
Page 6 of 17
QUESTION/ANSWER BOOKLET
Question 6
COMPSCI 367 Student ID: _______________
A bank in has asked us to write an app that allows customers to check on the status of their credit card accounts. In particular, customers whose cards have been disabled should be able to use the software to determine why the card was disabled, and to assess which activities will likely lead to their card being disabled.
A card can become disabled when the account is over its limit, when the card has expired, or when a fraud alert has occurred. Fraud alerts often occur when a card is used internationally or when the card is used at a new vendor, e.g. a store where the customer has never shopped before.
We construct the following belief network containing 7 Boolean variables:
1. International is true if and only if the cardholder uses the card overseas.
2. NewVendor is true if and only if the cardholder uses the card at a new vendor.
3. Alert is true if and only if the bank issues a fraud alert.
4. Limit is true if and only if the account is over its limit.
5. Expired is true if and only if the card has expired.
6. Disabled is true if and only if the card is disabled.
7. Contacted is true if and only if the bank has contacted the cardholder.
Page 7 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
a) Consider the following scenario: Let’s say you are a cardholder. You are travelling in Europe and your card stops working. However, you did not get contacted by the bank. What are the observed variables and their values in this scenario?
b) The following table shows the conditional probability distribution of the variable Contacted, P( Contacted | Alert, Disabled ).
What is the probability that the bank issues a fraud alert but it does not contact the cardholder? State why.
Page 8 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
c) Based on your answer in (b), what is then the probability that the bank issues a fraud alert given the evidence in (a)?
Question 7
A presidential election involves two candidates, C and T. Suppose we divide the the presidential race into n time periods. At any time period, the race could be in any one of three states: q1 (C leads), q2 (Tie), and q3 (T leads).
Let’s assume that the race can be described by a stationary Markov chain: The state at any given time t depends on the state at time t-1, as illustrated below.
a) Write down the transition matrix for this Markov chain.
Page 9 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
b) At each time period, we do not have a perfectly clear picture on the state of the race, but rather, we only have the result of an opinion poll. The poll surveys a small sample of voters and indicates whether any candidate seems to be leading, or they are in a tie. However, the result of the poll may be biased.
At each time period, each candidate may choose to focus on one issue from Employment, Foreign Policy, Immigration, Crimes, and Health Care. The issues both candidates focus on at any time period affect the race in the next time period.
Now you are asked to develop a program which, given the sequence of poll results, and the history of focused issues by both candidates in all earlier time periods, outputs a prediction of the current state of the presidential race.
Describe in detail how you would use a hidden Markov model to solve this problem. For this question you need to state:
• how you may model this scenario using a hidden Markov model (you don’t need to give precise probabilities);
• what problem you need to solve;
• an algorithm that solves this problem (give a rough sketch of how it works).
Page 10 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 8
Suppose you are a developer of an anonymous social network. You need to write a program that guesses users’ genders (Male or Female) based on their preferences to different genres of movies. Every movie belongs to one or more of the genres Comedy, Action, Romance, Drama and Horror. For each user we count the number of movies in each genre that the user likes (this is indicated by whether they click the “like” button for this movie).
For this task, you first collected information of a few volunteers, as listed in the following table:
Answer the following questions.
a) What are the input features and target features of this problem?
b) Explain the difference between the problems of classification and regression, then tell whether this problem is classification or regression.
Page 11 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
c) You plan to use instance-based learning to solve this problem. Assuming you use the Manhattan distance as your distance function, describe in detail how you would apply the nearest neighbour algorithm on the following test example.
d) Would you get a different output to the test example above if you choose 3NN algorithm instead? Why could it be better to use 3NN instead of NN?
Page 12 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
e) You then consider applying decision tree learning to solve this problem. For this you compare the number of actions movies against the numbers of movies of other genres liked by a user and define two Boolean features:
• AR is true if and only if the user likes more action movies than romance movies.
• AH is true if and only if the user likes more action movies than horror movies Let’s say you want to build the decision tree based on information gain (just like in ID3). If you can only split on one of the Boolean features AR and AH above, which feature would you choose? State your reasoning in detail.
Page 13 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 9
You want to write a program for a ski field which should tell whether a given day is suitable for skiing. We can measure a day’s temperature (in Celsius) and wind level (in the range {−1,0,1}). Based on these measurements, the program should divide the days into two classes, Positive (suitable for skiing) and Negative (unsuitable for skiing). Your training set consists of the following examples:
a) If you apply the perceptron learning algorithm on the training set above starting from an arbitrary initial weight vector, will the algorithm converge?
b) Show the detailed steps of the perceptron learning algorithm starting with the weight vector (w0, w1, w2) = (1,0,0) until it has trained all input once. You should apply the examples in the given order. For each step, write down the applied example, the classification result and the update of the weight vector.
Page 14 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
Question 10
Are the following sentences true? Explain why.
a) A machine learning algorithm computes a model from a set of training data, so as to be able to make reliable predictions on test data. The problem of overfitting occurs usually when the model is too simple and does not fit the training data well, and as a result the model cannot make good predictions.
b) Back propagation is a common method of training neural networks. When it trains a network using a training example, it determines first the values of the hidden units and then the values of the output units. The algorithm then passes the error back through the network, first computing the errors on the hidden units then the errors on the output nodes, and updates the weights based on the derivative of the error.
OVERFLOW PAGE 1
Page 15 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
This page was left blank for any questions that overflow. (If you have used this page, please indicate clearly under the relevant question that you have overflowed to this page.)
Page 16 of 17
QUESTION/ANSWER BOOKLET COMPSCI 367 Student ID: _______________
OVERFLOW PAGE 2
This page was left blank for any questions that overflow. (If you have used this page, please indicate clearly under the relevant question that you have overflowed to this page.)
_________________
Page 17 of 17