Update README.md
README.md
CHANGED
@@ -204,20 +204,20 @@ NPROC_PER_NODE=4 xtuner train ./pretrain.py --deepspeed deepspeed_zero2
The checkpoint and TensorBoard logs are saved in ./work_dirs/ by default. I train for only 1 epoch, the same as the original LLaVA paper. Some studies also report that training for multiple epochs makes the model overfit the training dataset and perform worse on other domains.
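
To watch these runs while they train, you can point TensorBoard at the log directory. This is only a minimal sketch: the exact run subdirectory that xtuner creates under ./work_dirs/ depends on the config name, so adjust the path to match what you actually see there.

```
# Minimal sketch: assumes TensorBoard is installed (pip install tensorboard)
# and that the event files live somewhere under ./work_dirs/.
tensorboard --logdir ./work_dirs --port 6006
```
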
This is my loss curve for llava-siglip-internlm2-1_8b-pretrain-v1:
![pretraining loss curve]()

And the learning rate curve:
![pretraining learning rate curve]()

2. Instruction-following fine-tuning
```
NPROC_PER_NODE=4 xtuner train ./finetune.py --deepspeed deepspeed_zero2
```
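
As far as I understand it, NPROC_PER_NODE simply tells the xtuner launcher how many processes (one per GPU) to spawn on this node, so the same command can be adapted to a different GPU count; the value below is only an illustration, not the setting I used.

```
# Example only: launch the same fine-tuning job on 2 GPUs instead of 4.
NPROC_PER_NODE=2 xtuner train ./finetune.py --deepspeed deepspeed_zero2
```
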
Here is my loss curve (it fluctuates strongly because the batch size is small and I only record the batch loss rather than the epoch loss):
![fine-tuning loss curve]()

And the learning rate curve:
![fine-tuning learning rate curve]()

## Convert the checkpoints to the Hugging Face safetensors format
```