# Simultaneous Machine Translation

This directory contains the code for the paper [Monotonic Multihead Attention](https://openreview.net/forum?id=Hyg96gBKPS)

## Prepare Data

[Please follow the instructions to download and preprocess the WMT'15 En-De dataset.](https://github.com/pytorch/fairseq/tree/simulastsharedtask/examples/translation#prepare-wmt14en2desh)

Another example of training an English to Japanese model can be found [here](docs/enja.md)
## Training

- MMA-IL

```shell
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type infinite_lookback \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --latency-weight-avg 0.1 \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```
- MMA-H

```shell
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type hard_aligned \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --latency-weight-var 0.1 \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```
- wait-k

```shell
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type wait-k \
    --waitk-lagging 3 \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```
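
The three commands above differ only in `--simul-type` and the latency-related flags; the data directory and the optimizer and regularization settings are the same. As a convenience, the shared part can be written once. The following is a minimal sketch, not part of fairseq: `run_variant`, `DRY_RUN`, and `FAIRSEQ_USER_DIR` are hypothetical names introduced here, and the default value of `FAIRSEQ_USER_DIR` is a placeholder that should be set to the `--user-dir` path used in the commands above. By default the script only prints each command for inspection.

```shell
#!/bin/sh
# Sketch: assemble the fairseq-train command for one training variant.
# run_variant, DRY_RUN and FAIRSEQ_USER_DIR are illustrative names, not fairseq options.
set -eu

# Placeholder: point this at the simultaneous_translation user directory.
FAIRSEQ_USER_DIR="${FAIRSEQ_USER_DIR:-path/to/simultaneous_translation}"

run_variant() {
    # Pick the variant-specific flags; they become the positional parameters.
    case "$1" in
        mma-il) set -- --simul-type infinite_lookback --latency-weight-avg 0.1 ;;
        mma-h)  set -- --simul-type hard_aligned --latency-weight-var 0.1 ;;
        wait-k) set -- --simul-type wait-k --waitk-lagging 3 ;;
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac

    # Prepend the program and data directory, append the shared flags.
    set -- fairseq-train data-bin/wmt15_en_de_32k "$@" \
        --user-dir "$FAIRSEQ_USER_DIR" \
        --mass-preservation \
        --criterion latency_augmented_label_smoothed_cross_entropy \
        --max-update 50000 \
        --arch transformer_monotonic_iwslt_de_en \
        --optimizer adam --adam-betas '(0.9, 0.98)' \
        --lr-scheduler inverse_sqrt \
        --warmup-init-lr 1e-7 --warmup-updates 4000 \
        --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
        --dropout 0.3 \
        --label-smoothing 0.1 \
        --max-tokens 3584

    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Printed for inspection only: quoting (e.g. around the Adam betas) is not preserved.
        echo "$*"
    else
        "$@"
    fi
}

# Dry run over all three variants.
for v in mma-il mma-h wait-k; do
    run_variant "$v"
done
```

Setting `DRY_RUN=0` makes `run_variant` execute the command instead of printing it, which requires `fairseq-train` to be on the `PATH`.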