CISC 489/689 Natural Language Processing Homework 2
Due at 11:59pm Monday, April 6th, 2020
1. Assign POS tags using the Penn Treebank (PTB) tags for the words in the following two sentences.
a. John and Mary bought a refrigerator with three doors .
b. It was purchased from a very small store near their house .
2. Assume a training set of four sentences in which each word has been assigned a PTB POS tag.
• buffalo/NNS flying/VBG is/VBZ dangerous/JJ
• flying/JJ planes/NNS are/VBZ numerous/JJ
• I/PRP saw/VBZ Mary/NNP flying/VBG planes/NNS
• He/PRP planes/VBZ shelves/NNS
a. Create an HMM from this training data by (i) calculating the likelihood probabilities for each word given each POS and (ii) calculating the transition probabilities where states are POS tags.
b. Next, draw a table whose columns are positions in the sentence and whose rows are the names of the states (start, end, and the POS tags), and fill in the probability scores that the Viterbi algorithm computes when it assigns POS tags to the string “flying planes.”
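As a way to check hand calculations for this problem, the following is a minimal sketch of unsmoothed relative-frequency estimation followed by Viterbi table filling. The state names `<s>` and `</s>`, the function names `estimate_hmm` and `viterbi`, the lowercasing of words, and the use of a transition into the end state when picking the final tag are all assumptions made for illustration; they are not prescribed by the assignment.

```python
from collections import Counter, defaultdict

# Training data copied from the exercise. "<s>" and "</s>" are assumed
# names for the start and end states.
TAGGED = [
    "buffalo/NNS flying/VBG is/VBZ dangerous/JJ",
    "flying/JJ planes/NNS are/VBZ numerous/JJ",
    "I/PRP saw/VBZ Mary/NNP flying/VBG planes/NNS",
    "He/PRP planes/VBZ shelves/NNS",
]


def normalize(table):
    """Turn nested counts into conditional relative frequencies."""
    return {
        ctx: {key: c / sum(counts.values()) for key, c in counts.items()}
        for ctx, counts in table.items()
    }


def estimate_hmm(sentences):
    """Return (transition, emission) tables estimated without smoothing."""
    trans_counts = defaultdict(Counter)
    emit_counts = defaultdict(Counter)
    for line in sentences:
        prev = "<s>"
        for token in line.split():
            word, tag = token.rsplit("/", 1)
            trans_counts[prev][tag] += 1
            emit_counts[tag][word.lower()] += 1  # assumption: case-insensitive
            prev = tag
        trans_counts[prev]["</s>"] += 1
    return normalize(trans_counts), normalize(emit_counts)


def viterbi(words, trans, emit):
    """Fill the table column by column; return (table, best tag sequence)."""
    tags = list(emit)
    table = [{} for _ in words]  # table[i][tag]: best score ending in tag at i
    back = [{} for _ in words]   # back[i][tag]: predecessor achieving that score
    for i, word in enumerate(words):
        for tag in tags:
            e = emit[tag].get(word.lower(), 0.0)
            if i == 0:
                prev_best = "<s>"
                score = trans["<s>"].get(tag, 0.0) * e
            else:
                prev_best = max(
                    tags,
                    key=lambda p: table[i - 1][p] * trans.get(p, {}).get(tag, 0.0),
                )
                score = (table[i - 1][prev_best]
                         * trans.get(prev_best, {}).get(tag, 0.0) * e)
            table[i][tag] = score
            back[i][tag] = prev_best
    # Assumption: the final choice also multiplies in P(</s> | tag).
    last = max(tags, key=lambda t: table[-1][t] * trans.get(t, {}).get("</s>", 0.0))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return table, path[::-1]


trans, emit = estimate_hmm(TAGGED)
table, path = viterbi(["flying", "planes"], trans, emit)
for column in table:
    print(column)
print(path)
```

Each printed dictionary corresponds to one column of the table requested in part b. Because no smoothing is applied, any tag or transition unseen in the four training sentences receives probability zero.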
3. Assume we have the following twelve grammar rules. Consider the sentence “The rain rains down”.
1. S→NP VP
2. NP→N
3. NP→DT N
4. VP→V ADVP
5. VP→V
6. ADVP → ADV
7. DT→the
8. N→rain
9. N→rains
10. V→rain
11. V → rains
12. ADV → down
Apply the CKY algorithm with these grammar rules on the above sentence and show the contents of the parse table for the parse.
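The grammar above contains unit rules (NP→N, VP→V, ADVP→ADV), so it is not in Chomsky normal form. One common way to handle this, sketched below, is to run the usual CKY span loop over the binary rules and close every cell under the unit rules. The names `BINARY`, `UNARY`, `LEXICON`, `close_unary`, and `cky` are hypothetical, and this is a recognizer-style sketch that records only the labels in each cell, not backpointers.

```python
from collections import defaultdict

# Grammar copied from the exercise, grouped by rule shape.
BINARY = {("NP", "VP"): {"S"}, ("DT", "N"): {"NP"}, ("V", "ADVP"): {"VP"}}
UNARY = {"N": {"NP"}, "V": {"VP"}, "ADV": {"ADVP"}}
LEXICON = {
    "the": {"DT"},
    "rain": {"N", "V"},
    "rains": {"N", "V"},
    "down": {"ADV"},
}


def close_unary(labels):
    """Add every label reachable through unit rules such as NP -> N."""
    agenda = list(labels)
    while agenda:
        child = agenda.pop()
        for parent in UNARY.get(child, ()):
            if parent not in labels:
                labels.add(parent)
                agenda.append(parent)
    return labels


def cky(words):
    """Return chart[(i, j)]: the set of labels spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    # Width-1 spans come from the lexicon.
    for i, word in enumerate(words):
        chart[i, i + 1] = close_unary(set(LEXICON.get(word.lower(), ())))
    # Longer spans combine two adjacent shorter spans at split point k.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                for left in chart[i, k]:
                    for right in chart[k, j]:
                        cell |= BINARY.get((left, right), set())
            chart[i, j] = close_unary(cell)
    return chart


words = "The rain rains down".split()
chart = cky(words)
for (i, j), labels in sorted(chart.items()):
    print(i, j, " ".join(words[i:j]), sorted(labels))
```

The printed cells can be compared against a hand-drawn parse table. The sentence is accepted if `S` appears in the cell that spans the whole input.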
4. Design a grammar that will cover the following sentences.
a. i want to fly from boston at 838 am and arrive in denver at 1110 in the morning
b. what flights are available from pittsburgh to baltimore on thursday morning
c. what is the arrival time in san francisco for the 755 am flight leaving washington
d. What is the cheapest airfare from tacoma to orlando
e. I want round trip fares from pittsburgh to philadelphia under 1000 dollars
f. i need a flight tomorrow from columbus to minneapolis
g. what kind of aircraft is used on a flight from cleveland to dallas
h. show me the flights from pittsburgh to los angeles on thursday
i. what kind of ground transportation is available in denver
j. show me the flights from dallas to san francisco
k. what is the airport at orlando
l. what is the cheapest flight from boston to bwi
Your grammar should be close to X-bar principles. It can stop at the “preterminal” level (N, V, Adv, Conj, etc.) and needn’t expand all the way to the lexical/terminal level (dallas, 838, etc.). Therefore, assume that normally an X (preterminal) will project to X-bar (i.e., with X-bar on the left-hand side of the rule and X on the right-hand side) and include any modifications of X. You are also allowed to use modification rules such as VP -> VP PP. Recall that modification is normally optional. Similarly, X-bar will project to XP and include all complements (required arguments), e.g., VP -> V-bar NP. Rules for conjunction are of the form X -> X conj X or XP -> XP conj XP.
Show your parse trees for sentences a and b. Ignoring semantics and relying only on syntax, almost all sentences here will be ambiguous (i.e., have multiple parse trees). Show at least two trees for sentence f.
5. Each day, a doctor asks a patient how the patient feels on that particular day. Assume the patient's response will be one of the following: normal, cold, or dizzy. Based on the response, the doctor guesses whether the patient is ok or unwell.
a) Model this with an HMM.
b) The patient visited the doctor on 3 consecutive days. Assuming the HMM probabilities are given, what method would you use to infer the doctor’s belief about the most likely sequence of health conditions of this patient if the patient’s responses to the doctor’s queries were normal, dizzy, and cold, respectively?
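To make the setup concrete, here is a small sketch that writes such an HMM as plain dictionaries and finds the most likely hidden sequence by brute-force enumeration of all 2^3 state sequences. Every number in `START`, `TRANS`, and `EMIT` is an invented placeholder, since the problem does not supply probabilities, and the names `joint_probability` and `best_sequence` are hypothetical. Enumeration is used here only because the example is tiny; a dynamic-programming decoder would reach the same answer without listing every sequence.

```python
from itertools import product

STATES = ("ok", "unwell")

# All probabilities below are hypothetical placeholders for illustration.
START = {"ok": 0.6, "unwell": 0.4}
TRANS = {
    "ok": {"ok": 0.7, "unwell": 0.3},
    "unwell": {"ok": 0.4, "unwell": 0.6},
}
EMIT = {
    "ok": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "unwell": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}


def joint_probability(states, observations):
    """P(states, observations) under the HMM defined above."""
    p = START[states[0]] * EMIT[states[0]][observations[0]]
    for prev, cur, obs in zip(states, states[1:], observations[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][obs]
    return p


def best_sequence(observations):
    """Score every possible state sequence and return the most probable one."""
    candidates = product(STATES, repeat=len(observations))
    return max(candidates, key=lambda s: joint_probability(s, observations))


observed = ("normal", "dizzy", "cold")
for seq in product(STATES, repeat=len(observed)):
    print(seq, joint_probability(seq, observed))
print("most likely:", best_sequence(observed))
```

With different probability tables the winning sequence can change, so the printed result illustrates the procedure rather than an answer to the question.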