COMP6714 Project Specification (stage 2)
October 4, 2018
1 COMP6714 18s2 Project
2 Stage 2: Modify a baseline model of hyponymy classification
2.1 Deadline and Late Penalty
The project deadline is 23:59 26 Oct 2018 (Fri).
Late penalty is -10% each day for the first three days, and then -20% each day afterwards.
2.2 Objective
As explained in stage 1, in this project you need to build a system that can extract hyponym and hypernym
relations from a sentence.
Now, we provide you with the baseline model and you are required to modifiy it according to the
specification given below. The baseline model follows the achitecture introduced in the Stage 1 spec.
2.3 Run (and understand) the baseline model
In order to play with the baseline model, you just need to execute the following command:
python train.py
where * You can modify config.py to change hyper-parameters. * You can modify randomness.py to
manipulate the randomness (e.g., change random seed). * If you want to test the performance of the trained
model, then you need to implement a test method by yourself.
We suggest that you read and understand the baseline model first.
2.4 Your tasks
You need to complete your implementation in the file todo.py. You are required to implemnet the following
three methods. * get char sequence() * new LSTMCell() * evaluate()
The modified todo.py will be submitted for evaluation.
NOTE: you can modify config.py to enable the above modifications in your model.
2.5 Task 1: Implement evaluate() (30%)
You are required to implement the evaluate() method in todo.py. This method computes the F1 score of
the given predicted tags and golden tags (i.e., ground truth).
The input arguments of evaluate() are: * golden list is a list of list of tags, which stores the golden
tags. * predict list is a list of list of tags, which stores the predicted tags.
The method should return the F1 score based on golden list and predict list. In this project, we
only consider the phrase level matching for TAR and HYP (O is not considered). Two entities are matched
when both the boundaries and the tags are the same.
For example, given
1
https://en.wikipedia.org/wiki/Hyponymy_and_hypernymy
golden_list = [[’B-TAR’, ’I-TAR’, ’O’, ’B-HYP’], [’B-TAR’, ’O’, ’O’, ’B-HYP’]]
predict_list = [[’B-TAR’, ’O’, ’O’, ’O’], [’B-TAR’, ’O’, ’B-HYP’, ’I-HYP’]]
• The first TAR in golden list does not match with predict list, as the boundary is not incorrect
(e.g., predict list[0][1] is O, which should be I-TAR for a correct matching).
• The second TAR in golden list matches with the second TAR in predict list, as both the boundary
and the tag are the same.
• The number of false positives in the above example is 2, the number of false negative in the above
example is 3, and the number of true positive is 1. Therefore, the F1 should be 0.286.
NOTE: * The length of the two lists are the same, and length of the i-th instance in both lists are the
same as well. Which means that you do not need to handle the alignment issue.
2.6 Task 2: Implement new LSTMCell() (30%)
You are required to implement a new version of the LSTM Cell (i.e., new LSTMCell() in todo.py), which
has a different logic of controlling the input gate.
Instead of separately deciding what to forget and what we should add new information to, we make those
decisions together. We only forget when we’re going to input something in its place. We only input new
values to the state when we forget something older.
Specifically, before the modification, we have
Ct = ft ∗ Ct−1 + it ∗ C̃t
where it is the activation vector of the input gate, and ft is the activation vector of the forget gate.
By letting it = 1 − ft, after the modification, we have
Ct = ft ∗ Ct−1 + (1 − ft) ∗ C̃t
NOTE: * Your implementation should base on the original implementation (i.e.,
torch.nn. functions.rnn.LSTMCell()). Please read and understand it first. * Please do not change the
input arguments of the method, i.e.,
def new_LSTMCell(input, hidden, w_ih, w_hh, b_ih=None, b_hh=None):
• We do not use GPU in this project, therefore, you do not need to change the following part (or simply
remove them from your implementation), as input.is cuda is always false:
if input.is_cuda:
igates = F.linear(input, w_ih)
hgates = F.linear(hidden[0], w_hh)
state = fusedBackend.LSTMFused.apply
return state(igates, hgates, hidden[1]) if b_ih is None else state(igates, hgates, hidden[1], b_ih, b_hh)
• In order to avoid unnecessary errors during the project evaluation, please do not change the following
three lines from the original implementation, although you may not use all the vaiables.
hx, cx = hidden
gates = F.linear(input, w_ih, b_ih) + F.linear(hx, w_hh, b_hh)
ingate, forgetgate, cellgate, outgate = gates.chunk(4, 1)
2
2.7 Task 3: Implement get char sequence() (20%)
You are required to implement a BiLSTM layer for Character embedding (i.e., char BiLSTM in the fol-
lowing figure). The output of the char BiLSTM will be concatenated with word embedding from the data
preprocessing to form the new input tensor.
The new architecture will be like
More specifically, you need to implement the method get char sequence() in todo.py. Its input argu-
ments are: * model is an object of sequence labeling, refer to line 49 of model.py to see how the method
is called * batch char index matrices is a tensor that can be viewed as a list of matrices storing char ids,
where each matrix corresponds to a sentence, each sentence corresponds to a list of words, and each word
corresponds to a list of char ids. * batch word len lists is tensor that can be viewed as a list of lists.
Where each list corresponds to a sentence, and stores the length of each word.
The output dimension of the char BiLSTM is defined in config.py (i.e., char lstm output dim)
Hint: * We suggest you to read and understand the code in model.py first, especially the part of BiLSTM
layer.
An example of the char BiLSTM layer is as below:
2.8 Report (20%)
You are required to experiment your implementation and submit a report (named as report.pdf). The
report should at least answer the following questions (you should answer them in different sections)
• How do you implement evaluate()?
• How does Modification 1 (i.e., storing model with best performance on the development set) affect the
performance?
• How do you implement new LSTMCell()?
• How does Modification 2 (i.e., re-implemented LSTM cell) affect the performance?
• How do you implement get char sequence()?
• How does Modification 3 (i.e., adding Char BiLSTM layer) affect the performance?
You may need to implement a test function in order to test the performance of models.
2.9 Submission
You need to submit the following 2 files: 1. todo.py 2. report.pdf
NOTE: The detail of how to submit your files will be announced later in the Piazza forum.
2.10 Bonus
After completing the project, you are welcomed to implement your own model (rather than modifying the
given baseline implementation). If you choose to do so, please make sure that
1. your implementation outperforms the baseline model by a large margin (the number will be announced
later) on the given test set.
2. you report the implementation details as a short report.
There are some research papers for your reference: * Long Short-Term Memory as a Dynamically Com-
puted Element-wise Weighted Sum – ACL18 * Deep contextualized word representations – NAACL18 * Fast
and accurate entity recognition with iterated dilated convolutions – EMNLP17 * Neural models for sequence
chunking – AAAI17
2.10.1 Submission of Bonus Part
If you choose to do a bonus part, you need to submit a .zip file which contains: 1. the code of your model
2. the report (as a pdf file)
3
http://www.aclweb.org/anthology/P18-2116
http://www.aclweb.org/anthology/P18-2116
https://arxiv.org/abs/1802.05365
http://aclweb.org/anthology/D17-1283
http://aclweb.org/anthology/D17-1283
https://arxiv.org/abs/1701.04027
https://arxiv.org/abs/1701.04027
The report should contain at least the following two parts: 1. The implementation detail of your model
(e.g., what are the differences between your model and the baseline model). 2. The instruction of how to
execute your code.
NOTE: * It is unnecessary to include the training, development and testing files in your submission. *
The detail of the bonus part will be announced later in the Piazza forum.
4
COMP6714 18s2 Project
Stage 2: Modify a baseline model of hyponymy classification
Deadline and Late Penalty
Objective
Run (and understand) the baseline model
Your tasks
Task 1: Implement evaluate() (30%)
Task 2: Implement new_LSTMCell() (30%)
Task 3: Implement get_char_sequence() (20%)
Report (20%)
Submission
Bonus
Submission of Bonus Part