haonan-li committed on
Commit
80dabb2
1 Parent(s): cc5ee22

add README

Files changed (1)
  1. README.md +69 -0
README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ license: mit
+ ---
+
+ #### Current Training Steps: 108,000
+
+
+ This repo contains a merged model using low-rank adaptation (LoRA) for LLaMA-13b
+ fit on the [Stanford-Alpaca-52k](https://github.com/tatsu-lab/stanford_alpaca)
+ and [databricks-dolly-15k](https://github.com/databrickslabs/dolly/tree/master/data) data in 52 languages.
+
+ ### Dataset Creation
+
+ 1. English Instructions: The English instructions are obtained from [alpaca-52k](https://github.com/tatsu-lab/stanford_alpaca) and [dolly-15k](https://github.com/databrickslabs/dolly/tree/master/data).
+ 2. Instruction Translation: The instructions (and inputs) are translated into the target languages using the Google Translation API (conducted in April 2023).
+ 3. Output Generation: We generate outputs from `gpt-3.5-turbo` for each language (conducted in April 2023).
+
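The three dataset-creation steps can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' code: `translate` and `generate` are hypothetical callables standing in for the Google Translation API and `gpt-3.5-turbo`, and the `instruction` / `input` / `output` field names are assumed from the Alpaca data format.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def build_multilingual_examples(
    english_examples: List[Example],
    languages: List[str],
    translate: Callable[[str, str], str],  # hypothetical: (text, target_lang) -> text
    generate: Callable[[str, str], str],   # hypothetical: (instruction, input) -> output
) -> List[Example]:
    """Translate each instruction/input pair, then generate one output per language."""
    results: List[Example] = []
    for lang in languages:
        for example in english_examples:
            # Step 2: translate the instruction and the (possibly empty) input.
            instruction = translate(example["instruction"], lang)
            source_input = example.get("input", "")
            translated_input = translate(source_input, lang) if source_input else ""
            # Step 3: generate the output from the translated prompt.
            output = generate(instruction, translated_input)
            results.append(
                {
                    "lang": lang,
                    "instruction": instruction,
                    "input": translated_input,
                    "output": output,
                }
            )
    return results
```

Passing the two services in as callables keeps the sketch independent of any particular client library and makes it easy to exercise with stubs.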
+ <h3 align="center">
+ <img src="https://raw.githubusercontent.com/fajri91/eval_picts/master/BactrianX_dataset.jpg" width="950" align="center">
+ </h3>
+
+ ### Training Parameters
+
+ The code for training the model is provided in our [GitHub repository](https://github.com/mbzuai-nlp/Bactrian-X), which is adapted from [Alpaca-LoRA](https://github.com/tloen/alpaca-lora).
+ This version of the weights was trained with the following hyperparameters:
+
+
+ - Epochs: 10
+ - Batch size: 128
+ - Cutoff length: 512
+ - Learning rate: 3e-4
+ - LoRA _r_: 64
+ - LoRA target modules: q_proj, k_proj, v_proj, o_proj
+
+
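To give a feel for what rank 64 on four projection matrices means, the sketch below counts the trainable LoRA parameters. It assumes the standard LLaMA-13B shape (hidden size 5120, 40 transformer layers, square attention projections), which this README does not state; the function name is illustrative. Each adapted matrix gains two low-rank factors, one of shape `r x d` and one of shape `d x r`.

```python
def lora_param_count(hidden_size: int, num_layers: int, num_target_modules: int, rank: int) -> int:
    """Count LoRA parameters when every target module is a square hidden_size x hidden_size projection."""
    # Factor A is (rank x hidden_size) and factor B is (hidden_size x rank).
    per_module = rank * (hidden_size + hidden_size)
    return per_module * num_target_modules * num_layers


HIDDEN_SIZE = 5120  # assumption: LLaMA-13B hidden size
NUM_LAYERS = 40     # assumption: LLaMA-13B layer count
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]  # from the list above
LORA_R = 64         # from the list above

trainable = lora_param_count(HIDDEN_SIZE, NUM_LAYERS, len(TARGET_MODULES), LORA_R)
print(f"{trainable:,}")  # 104,857,600
```

Under these assumptions the adapters add roughly 105M trainable parameters, a bit under 1% of the 13B base model.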
+ That is:
+
+ ```
+ python finetune.py \
+ --base_model='decapoda-research/llama-13b-hf' \
+ --num_epochs=5 \
+ --batch_size=128 \
+ --cutoff_len=512 \
+ --group_by_length \
+ --output_dir='./bactrian-x-llama-13b-lora' \
+ --lora_target_modules='q_proj,k_proj,v_proj,o_proj' \
+ --lora_r=64 \
+ --micro_batch_size=32
+ ```
+
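With `--batch_size=128` and `--micro_batch_size=32`, the effective batch is reached by accumulating gradients over several micro-batches. The arithmetic can be sketched as follows; the helper name and the `world_size` parameter are illustrative, and a single device is assumed by default (in data-parallel training the step count would be divided by the number of devices).

```python
def gradient_accumulation_steps(batch_size: int, micro_batch_size: int, world_size: int = 1) -> int:
    """Number of micro-batches each device accumulates before one optimizer step."""
    per_step = micro_batch_size * world_size
    if batch_size % per_step != 0:
        raise ValueError("batch_size must be divisible by micro_batch_size * world_size")
    return batch_size // per_step


# Values taken from the command above, on a single device.
steps = gradient_accumulation_steps(batch_size=128, micro_batch_size=32)
print(steps)  # 4
```

Each optimizer update therefore sees 4 micro-batches of 32 examples on one device.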
+ Instructions for running it can be found at https://github.com/MBZUAI-nlp/Bactrian-X.
+
+ ### Discussion of Biases
+
+ (1) Translation bias; (2) Potential English-culture bias in the translated dataset.
+
+
+ ### Citation Information
+
+ ```
+ @misc{li2023bactrianx,
+ title={Bactrian-X : A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation},
+ author={Haonan Li and Fajri Koto and Minghao Wu and Alham Fikri Aji and Timothy Baldwin},
+ year={2023},
+ eprint={2305.15011},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```