# PA4 README

This README gives an overview of the scripts and explains how to run them.

## Setup

```shell script
pip install -r requirements.txt
```

## Run

There are three scripts to be run sequentially.

### 1. Prepare Data

prepare_data.py loads the raw PTB dataset, performs preprocessing at the token level, and exports the outputs for later use.

```shell script
python src/prepare_data.py

usage: PA4 Data Loading & Preprocessing Argparser [-h] [--ptb_dir PTB_DIR] [--out_dir OUT_DIR] [--lower] [--reverse_sent] [--prune] [--XX_norm] [--closing_tag] [--force]

optional arguments:
  -h, --help          show this help message and exit
  --ptb_dir PTB_DIR   path to data directory
  --out_dir OUT_DIR   path to data outputs
  --lower             whether to lower all sentence strings
  --reverse_sent      whether to reverse the source sentences
  --prune             whether to remove parenthesis for terminal POS tags
  --XX_norm           whether to normalize all POS tags to XX
  --closing_tag       whether to attach closing tags
  --force             whether to force running entire script
```

#### Source-side Preprocessing

* `--lower`: lower-case all source sentences
* `--reverse_sent`: reverse the order of the source sentences

#### Target-side Preprocessing

Our chief aim here is to linearize a phrase-structure tree, which is bracketed and has POS tags as its vocabulary. The available transformations are listed below and sketched in the code example that follows.

* `--prune`: a terminal POS tag introduces an additional pair of parentheses with itself as the only element; this flag removes that pair.
  * For example, `(TAG1 (TAG2 ) )` -> `(TAG1 TAG2 )`
* `--XX_norm`: various POS tags show up in PTB; this flag normalizes all of them to XX.
  * For example, `(TAG1 (TAG2 ..` -> `(XX (XX ..`
* `--closing_tag`: append the matching POS tag after each closing parenthesis.
  * For example, `(TAG1 (TAG2 ) )` -> `(TAG1 (TAG2 )TAG2 )TAG1`
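
Below is a minimal Python sketch of these three transformations, operating on a tokenized bracketed string. The function names and token format are illustrative only and do not reflect the actual prepare_data.py implementation.

```python
# Illustrative sketch only; prepare_data.py may implement this differently.
# A linearized tree is assumed to be a list of tokens such as ["(S", "(NP", ")", ")"].

def prune(tokens):
    """Collapse '(TAG )' spans that contain nothing else into a bare 'TAG'."""
    out, i = [], 0
    while i < len(tokens):
        if tokens[i].startswith("(") and i + 1 < len(tokens) and tokens[i + 1] == ")":
            out.append(tokens[i][1:])  # "(TAG2" ")"  ->  "TAG2"
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def xx_norm(tokens):
    """Replace every POS tag label with the generic XX."""
    return ["(XX" if t.startswith("(") else t for t in tokens]

def closing_tag(tokens):
    """Attach the matching POS tag to each closing parenthesis."""
    stack, out = [], []
    for t in tokens:
        if t.startswith("("):
            stack.append(t[1:])
            out.append(t)
        elif t == ")":
            out.append(")" + stack.pop())
        else:
            out.append(t)
    return out

print(" ".join(prune("(TAG1 (TAG2 ) )".split())))        # (TAG1 TAG2 )
print(" ".join(xx_norm("(TAG1 (TAG2 ) )".split())))      # (XX (XX ) )
print(" ".join(closing_tag("(TAG1 (TAG2 ) )".split())))  # (TAG1 (TAG2 )TAG2 )TAG1
```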

### 2. Training

train.py performs training and periodically evaluates the model's performance on the dev set.

Take care when running the attentional Seq2seq variants. They are computationally heavier than the vanilla version and consume more memory, because the decoder hidden state is broadcast across all encoder time-steps at every decoding step.
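
The toy NumPy snippet below (not the actual train.py code) shows where that extra memory goes under an additive, Bahdanau-style scoring scheme; the learned projection matrices are omitted for brevity and all weights are random placeholders.

```python
import numpy as np

# The decoder hidden state is broadcast against every encoder time-step at each
# decoding step, so the intermediate tensor scales with batch_size * src_len.
batch, src_len, hidden = 32, 40, 256
enc_outputs = np.random.randn(batch, src_len, hidden)   # all encoder hidden states
dec_hidden = np.random.randn(batch, 1, hidden)          # current decoder hidden state

energies = np.tanh(dec_hidden + enc_outputs)            # broadcast -> (batch, src_len, hidden)
v = np.random.randn(hidden)                             # scoring vector (placeholder)
scores = energies @ v                                   # (batch, src_len)
scores -= scores.max(-1, keepdims=True)                 # numerically stable softmax
weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
context = (weights[:, None, :] @ enc_outputs).squeeze(1)   # (batch, hidden) context vector
print(energies.shape)                                   # (32, 40, 256): the costly intermediate
```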

```shell script
python src/train.py

usage: PA4 Training Argparser [-h] [--data_dir DATA_DIR] [--model_dir MODEL_DIR] [--glove_dir GLOVE_DIR] [--model {seq2seq,bahdanau,luong_dot,luong_general}] [--embed_dim EMBED_DIM] [--rnn {gru,lstm}] [--num_layers NUM_LAYERS] [--hidden_dim HIDDEN_DIM] [--dropout DROPOUT] [--finetune_pretrained] [--epochs EPOCHS] [--eval_every EVAL_EVERY] [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE] [--teacher_forcing_ratio TEACHER_FORCING_RATIO] [--seed SEED] [--resume_training]

optional arguments:
  -h, --help            show this help message and exit

Path Hyperparameters:
  --data_dir DATA_DIR   path to data directory
  --model_dir MODEL_DIR path to model outputs
  --glove_dir GLOVE_DIR path to glove dir; should be specified when using GloVe

Model Definition Hyperparameters:
  --model {seq2seq,bahdanau,luong_dot,luong_general}
                        which model to init and train
  --embed_dim EMBED_DIM embedding dimension
  --rnn {gru,lstm}      type of rnn to use in encoder and decoder
  --num_layers NUM_LAYERS
                        number of rnn layers in encoder and decoder
  --hidden_dim HIDDEN_DIM
                        rnn hidden dimension
  --dropout DROPOUT     dropout probability
  --finetune_pretrained whether to make pretrained embeddings trainable
  --teacher_forcing_ratio TEACHER_FORCING_RATIO
                        teacher forcing ratio

Experiment Hyperparameters:
  --epochs EPOCHS       number of training iterations
  --eval_every EVAL_EVERY
                        interval of epochs to perform evaluation on dev set
  --batch_size BATCH_SIZE
                        size of mini batch
  --learning_rate LEARNING_RATE
                        learning rate
  --seed SEED           seed value for replicability
  --resume_training     whether to resume training
```
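
For example, a training run with the Bahdanau model might look like the following; the directory paths and hyperparameter values are placeholders, only the flag names come from the usage above.

```shell script
python src/train.py --data_dir data/ --model_dir models/bahdanau --model bahdanau --rnn lstm --epochs 20 --eval_every 2 --batch_size 32
```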

### 3. Inference

predict.py only requires `--model_dir`, the directory containing the latest checkpoint from training.

```shell script
python src/predict.py

usage: PA4 Inference Argparser [-h] --model_dir MODEL_DIR

optional arguments:
  -h, --help            show this help message and exit
  --model_dir MODEL_DIR path to model dir
```

Inference processes a single sentence at a time. This is conventionally done with a while-loop that breaks once the EOS token is predicted; instead, we cap decoding at an arbitrary threshold of 3 * the source sequence length and use a for-loop, which eliminates any chance of an infinite loop (see the sketch below). Empirically, linearized trees are on average about 2.5 times as long as the source sequence, although this varies with the flags used when preparing the data in Step 1.
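
A minimal sketch of this bounded decoding loop is shown below; decoder_step, sos_id, and eos_id are placeholders rather than the actual predict.py API.

```python
def greedy_decode(decoder_step, src_tokens, sos_id, eos_id):
    """Greedy decoding capped at 3 * source length instead of an open-ended while-loop."""
    max_len = 3 * len(src_tokens)
    prediction, prev_token, state = [], sos_id, None
    for _ in range(max_len):
        prev_token, state = decoder_step(prev_token, state)  # argmax token + new decoder state
        if prev_token == eos_id:
            break                      # a well-behaved model terminates early
        prediction.append(prev_token)
    return prediction                  # never longer than 3 * len(src_tokens)

# Dummy decoder for demonstration: emits token 7 three times, then EOS (id 2).
def dummy_step(prev_token, state):
    step = 0 if state is None else state
    return (2 if step >= 3 else 7), step + 1

print(greedy_decode(dummy_step, src_tokens=[5, 6, 7, 8], sos_id=1, eos_id=2))  # [7, 7, 7]
```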

## Run All

To run everything end to end, execute run_all.sh. First, make it executable:

```shell script
chmod +x ./run_all.sh
```

You must specify data_dir and model_dir, in exactly this order, followed by any optional flags used by the individual scripts. Flags recognized only by one script are ignored by the others.

For example:

```shell script
./run_all.sh [YOUR_DATA_DIR] [YOUR_MODEL_DIR] --rnn gru --XX_norm --closing_tag --force
```

## Interpreting the results

Two metrics are in use, BLEU and token-level accuracy, neither of which is exactly conventional for syntactic parsing. Token-level accuracy can be especially deceiving: when the linearized phrase-structure trees contain no closing POS tags, a model that outputs only closing brackets will still score very high. Even so, both metrics give some sense of training progress.
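
A toy illustration of that failure mode (this is not the project's metric code):

```python
# A degenerate model that only emits closing brackets still matches every ')' in
# the reference, so plain token-level accuracy rewards it with substantial credit.
ref = "(XX (XX (XX ) ) (XX ) )".split()    # a linearized tree without closing tags
hyp = [")"] * len(ref)                     # prediction: nothing but closing brackets
accuracy = sum(r == h for r, h in zip(ref, hyp)) / len(ref)
print(accuracy)                            # 0.5 on this toy tree, despite zero structure
```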

We do not use the canonical bracketed F1 score (EVALB) because it requires well-formed trees as input, and a model has to be trained well enough to produce such trees, which is difficult given the hardware constraints in terms of time and CPU/GPU specs. Extensive post-processing heuristics to repair ill-formed trees could also help.

## Reference

- Vinyals et al. (2015), Grammar as a Foreign Language