A slightly modified mpt-30b, with a few updates (e.g. gradient checkpointing support) to make it compatible with the qlora training code.

Original model: https://huggingface.co/mosaicml/mpt-30b

My fork of qlora with mpt-30b support: https://github.com/jondurbin/qlora

Differences in the qlora scripts:

- requires adding `--mpt True` for mpt-based models
- uses `--num_train_epochs` instead of `--max_steps`
- uses the airoboros prompt format (mostly 1:1 with vicuna) rather than alpaca, and expects an input file in JSONL format with `instruction` and `response` fields (an illustrative record is shown at the end of this README)

__I think there's a bug in gradient accumulation, so if you try this, maybe set gradient accumulation steps to 1.__

*My first attempts used a batch size of 6 with 16 gradient accumulation steps, but after three epochs the results with gradient accumulation were quite a bit worse than without it.*

__5 epochs seemed to achieve the best results, but YMMV.__

Full example of the tuning command (used for airoboros-mpt-30b-gpt4-1.4):

```
# activate the virtualenv with the qlora fork's requirements installed
source /workspace/venv/bin/activate
# make the local (modified) mpt-30b model code importable
export PYTHONPATH=./mpt-30b
export WANDB_API_KEY=[redacted]
export WANDB_PROJECT=airoboros-mpt-30b-gpt4-1.4
python qlora.py \
    --model_name_or_path ./mpt-30b \
    --output_dir ./$WANDB_PROJECT-checkpoints \
    --num_train_epochs 5 \
    --logging_steps 1 \
    --save_strategy steps \
    --data_seed 11422 \
    --save_steps 100 \
    --save_total_limit 3 \
    --evaluation_strategy "no" \
    --eval_dataset_size 2 \
    --max_new_tokens 8192 \
    --dataloader_num_workers 3 \
    --logging_strategy steps \
    --remove_unused_columns False \
    --do_train \
    --lora_r 64 \
    --lora_alpha 16 \
    --lora_modules all \
    --double_quant \
    --quant_type nf4 \
    --bf16 \
    --bits 4 \
    --warmup_ratio 0.03 \
    --lr_scheduler_type constant \
    --dataset ./instructions.jsonl \
    --dataset_format airoboros \
    --model_max_len 8192 \
    --gradient_checkpointing \
    --per_device_train_batch_size 6 \
    --gradient_accumulation_steps 1 \
    --learning_rate 0.0001 \
    --adam_beta2 0.999 \
    --max_grad_norm 0.3 \
    --lora_dropout 0.05 \
    --weight_decay 0.0 \
    --seed 11422 \
    --trust_remote_code \
    --mpt True \
    --report_to wandb
```

### Merged model

Run the `merge_weights.py` script from the qlora repo to merge the LoRA adapter back into the base model: https://github.com/jondurbin/qlora/blob/main/merge_weights.py

Then, copy all of the original python files from the mpt-30b repo into your output directory, since the merged model still needs the custom MPT code in order to load: https://huggingface.co/mosaicml/mpt-30b/tree/main
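For example, assuming the base model was cloned to `./mpt-30b` and `merge_weights.py` wrote the merged model to `./airoboros-mpt-30b-gpt4-1.4` (both paths are illustrative; adjust them to your setup), the copy step is just:

```
# copy the custom MPT modeling/configuration/tokenizer python files
# alongside the merged weights (paths below are illustrative)
cp ./mpt-30b/*.py ./airoboros-mpt-30b-gpt4-1.4/
```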
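### Data format

For reference, the `--dataset` file mentioned above is plain JSONL: one JSON object per line with `instruction` and `response` keys. The record below is purely illustrative:

```
{"instruction": "Summarize what gradient checkpointing does in one sentence.", "response": "Gradient checkpointing reduces memory usage during training by recomputing intermediate activations in the backward pass instead of storing them all."}
```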