Upload README.md with huggingface_hub

README.md CHANGED

@@ -111,18 +111,21 @@ This repository also includes GGUF format models optimized for use with **llama.cpp**
 | File | Size | Format | Use Case | RAM Required |
 |------|------|--------|----------|--------------|
 | `merged-sci-model.gguf` | 14GB | F16 | Maximum quality inference | ~16GB |
+| `merged-sci-model-q6_k.gguf` | 5.6GB | Q6_K | High quality with good compression | ~8GB |
+| `merged-sci-model-q5_k_m.gguf` | 4.8GB | Q5_K_M | Excellent quality/size balance | ~7GB |
+| `merged-sci-model-q5_k_s.gguf` | 4.7GB | Q5_K_S | Good quality, slightly smaller | ~7GB |
 | `merged-sci-model-q4_k_m.gguf` | 4.1GB | Q4_K_M | Balanced quality/performance | ~6GB |
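Before loading any of these files, a quick integrity check can save time: GGUF files begin with the ASCII magic bytes `GGUF`, so a truncated or interrupted download usually fails this test. A minimal sketch in Python (the helper name is ours, not part of this repo):

```python
def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Example (hypothetical local path):
# print(is_gguf("merged-sci-model-q5_k_m.gguf"))
```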
 
 ### Usage with Ollama
 
 **1. Download and create Modelfile:**
 ```bash
-# Download the
-wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-
+# Download the Q5_K_M model (recommended balance of quality/size)
+wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf
 
 # Create Modelfile
 cat > Modelfile << 'EOF'
-FROM ./merged-sci-model-
+FROM ./merged-sci-model-q5_k_m.gguf
 TEMPLATE """<|im_start|>system
 You are a specialized medical assistant for people with spinal cord injuries. Your responses should always consider the unique needs, challenges, and medical realities of individuals living with SCI.<|im_end|>
 <|im_start|>user

@@ -152,12 +155,12 @@ cd llama.cpp
 make
 
 # Download model
-wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-
+wget https://huggingface.co/basiphobe/sci-assistant/resolve/main/merged-sci-model-q5_k_m.gguf
 ```
 
 **2. Interactive chat:**
 ```bash
-./main -m merged-sci-model-
+./main -m merged-sci-model-q5_k_m.gguf \
   --temp 0.7 \
   --repeat_penalty 1.1 \
   -c 4096 \

@@ -168,7 +171,7 @@
 
 **3. Single prompt:**
 ```bash
-./main -m merged-sci-model-
+./main -m merged-sci-model-q5_k_m.gguf \
   --temp 0.7 \
   -c 2048 \
   -p "<|im_start|>system\nYou are a specialized medical assistant for people with spinal cord injuries.<|im_end|>\n<|im_start|>user\nWhat exercises are good for someone with paraplegia?<|im_end|>\n<|im_start|>assistant\n"
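The `--temp` and `--repeat_penalty` flags in the commands above map to standard sampling tweaks. Roughly, llama.cpp penalizes the logits of recently generated tokens (dividing positive logits by the penalty, multiplying negative ones) and then scales everything by the temperature; this is a simplified Python sketch, not llama.cpp's actual code:

```python
def adjust_logits(logits, prev_tokens, temp=0.7, repeat_penalty=1.1):
    """Simplified sketch of llama.cpp-style repeat penalty + temperature."""
    out = []
    for tok_id, logit in enumerate(logits):
        if tok_id in prev_tokens:
            # Discourage tokens that already appeared in the context window
            logit = logit / repeat_penalty if logit > 0 else logit * repeat_penalty
        out.append(logit / temp)  # temperature < 1 sharpens the distribution
    return out
```

A penalty of 1.1 is a mild nudge against loops; temperature 0.7 keeps answers focused without being fully deterministic.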

@@ -176,10 +179,13 @@
 
 ### Performance Comparison
 
-- **F16 Model** (`merged-sci-model.gguf`): Maximum quality,
-- **
+- **F16 Model** (`merged-sci-model.gguf`): Maximum quality, largest memory footprint
+- **Q6_K Model** (`merged-sci-model-q6_k.gguf`): Near-maximum quality with 60% size reduction
+- **Q5_K_M Model** (`merged-sci-model-q5_k_m.gguf`): Excellent quality retention, good balance
+- **Q5_K_S Model** (`merged-sci-model-q5_k_s.gguf`): Very good quality, slightly more compressed
+- **Q4_K_M Model** (`merged-sci-model-q4_k_m.gguf`): Good quality, smallest size, recommended for resource-constrained environments
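The comparison above, together with the RAM column in the table, reduces to a simple selection rule: take the highest-quality variant whose approximate RAM requirement fits your budget. A sketch with the figures copied from the table (the helper itself is hypothetical):

```python
# (filename, approx. RAM required in GB), ordered best quality first,
# figures taken from the table in this README
MODELS = [
    ("merged-sci-model.gguf", 16),
    ("merged-sci-model-q6_k.gguf", 8),
    ("merged-sci-model-q5_k_m.gguf", 7),
    ("merged-sci-model-q5_k_s.gguf", 7),
    ("merged-sci-model-q4_k_m.gguf", 6),
]

def pick_model(ram_gb):
    """Return the highest-quality GGUF that fits the given RAM budget."""
    for name, required in MODELS:
        if ram_gb >= required:
            return name
    return None  # below ~6GB none of the variants fit comfortably
```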
-
+All models use the **ChatML** template format and support up to **32K context length**.
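Since every variant expects ChatML, prompts like the escaped string in the single-prompt example can be assembled with a small helper (hypothetical, for illustration only):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Format one system + user turn in ChatML, ending at the assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```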
 
 ## Intended Use