loghugging25 committed (verified)
Commit 8ed9715 · Parent(s): 7c17d0b

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED

@@ -133,7 +133,7 @@ llamafactory-cli train logicsct_train_Phi4_qlora_sft_otfq.yaml # VRAM used
 llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq.yaml # VRAM used: 30927MiB for inference of base model + QLoRA adapter
 llamafactory-cli export logicsct_export_Phi4_qlora_sft.yaml # VRAM used: 665MiB + about 29 GB of system RAM for exporting a merged verison of the model with its adapter
 llamafactory-cli export logicsct_export_Phi4_qlora_sft_Q4.yaml # VRAM used: 38277MiB for a 4bit quant export of the merged model
-llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq_Q4.yaml # VRAM used: 9255MiB-11405MiB VRAM for inference of the 4bit quant merged model (increasing with increasing context length)
+llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq_Q4.yaml # VRAM used: 9255MiB-11405MiB for inference of the 4bit quant merged model (increasing with increasing context length)
 ```
 
 ### Comparison of Open Source Training/Models with OpenAI Proprietary Fine-Tuning
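The VRAM figures quoted in the diff's comments (e.g. 9255MiB-11405MiB for the 4-bit quant chat run) can be reproduced by sampling `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits` while a `llamafactory-cli` command runs and keeping the peak reading. The README does not say how its numbers were measured, so this is only a sketch of one plausible approach; `peak_vram_mib` and the sample readings are hypothetical.

```python
# Hypothetical helper: given CSV lines as emitted by
#   nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits
# (one integer MiB value per sample), return the peak VRAM observed.
# Polling nvidia-smi in a loop while llamafactory-cli runs is an
# assumption, not the method the README authors used.
def peak_vram_mib(samples: list[str]) -> int:
    """Return the peak memory.used reading (in MiB) from nvidia-smi samples."""
    return max(int(line.strip()) for line in samples)

# Example readings spanning a chat session whose context length grows,
# matching the range reported in the diff comment.
readings = ["9255", "10112", "11405"]
print(peak_vram_mib(readings))  # -> 11405
```

A range rather than a single number (as in the diff comment) falls out naturally from this: report both `min` and `max` of the samples, since KV-cache growth with context length makes inference memory usage non-constant.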