Tags: Question Answering, Transformers, Safetensors, German, phi3, text-generation, Connect-Transport, Logics Software, German support chatbot, German AI chatbot, customer service chatbot, German chatbot, AI chatbots for companies, Chatbot for SMEs, Question-answering, QLoRA fine-tuning, LLM training, custom_code, text-generation-inference
Update README.md
README.md (changed)
````diff
@@ -133,7 +133,7 @@ llamafactory-cli train logicsct_train_Phi4_qlora_sft_otfq.yaml # VRAM used
 llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq.yaml # VRAM used: 30927MiB for inference of base model + QLoRA adapter
 llamafactory-cli export logicsct_export_Phi4_qlora_sft.yaml # VRAM used: 665MiB + about 29 GB of system RAM for exporting a merged version of the model with its adapter
 llamafactory-cli export logicsct_export_Phi4_qlora_sft_Q4.yaml # VRAM used: 38277MiB for a 4bit quant export of the merged model
-llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq_Q4.yaml # VRAM used: 9255MiB-11405MiB
+llamafactory-cli chat logicsct_inference_Phi4_qlora_sft_otfq_Q4.yaml # VRAM used: 9255MiB-11405MiB for inference of the 4bit quant merged model (increasing with increasing context length)
 ```
 
 ### Comparison of Open Source Training/Models with OpenAI Proprietary Fine-Tuning
````
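Each `llamafactory-cli` command above reads its settings from a YAML config. The actual `logicsct_export_Phi4_qlora_sft_Q4.yaml` is not shown in this diff, so the following is only a sketch of what a LLaMA-Factory quantized-export config typically contains; the model path, adapter path, template name, and dataset path below are assumptions, not values from this repository:

```yaml
# Hypothetical sketch of an export config in the style of
# logicsct_export_Phi4_qlora_sft_Q4.yaml -- the real file is not part of
# this diff, so all paths and values here are placeholders.
model_name_or_path: microsoft/phi-4          # assumed base model
adapter_name_or_path: saves/phi4-qlora-sft   # assumed QLoRA adapter checkpoint
template: phi4                               # chat template name (assumed)
finetuning_type: lora
export_dir: models/phi4-qlora-sft-q4         # where the merged model is written
export_quantization_bit: 4                   # 4bit quant of the merged model
export_quantization_dataset: data/c4_demo.json  # calibration data (assumed)
export_size: 2                               # shard size in GB
export_legacy_format: false
```

Such a file would then be passed to the CLI as in the commands above, e.g. `llamafactory-cli export logicsct_export_Phi4_qlora_sft_Q4.yaml`.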