> [!TIP]
> We recommend setting `temperature=0.6` and `top_p=0.95` in the sampling parameters.
### Long context processing

The current `config.json` is set for a context length of up to 65,536 tokens. To handle longer inputs (128K or 256K tokens), we utilize YaRN: you can change `max_position_embeddings` and `rope_scaling` to:
```
{
  ...,
  "rope_scaling": {
    "factor": 2.0,  # 2 * 65536 = 131072
    "original_max_position_embeddings": 65536,
    "type": "yarn"
  }
}
```
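The same edit can also be applied programmatically. Below is a minimal Python sketch, assuming you hold the parsed `config.json` as a dict (the real file has many more keys than shown here); it sets the YaRN fields and derives the new `max_position_embeddings` from the scaling factor:

```python
import json

# Stand-in for the parsed config.json; the shipped file has many more keys.
config = {"max_position_embeddings": 65536}

# Add YaRN scaling: a factor of 2.0 doubles the context window.
config["rope_scaling"] = {
    "factor": 2.0,
    "original_max_position_embeddings": 65536,
    "type": "yarn",
}

# Derive the extended limit: 2.0 * 65536 = 131072 tokens.
config["max_position_embeddings"] = int(
    config["rope_scaling"]["factor"]
    * config["rope_scaling"]["original_max_position_embeddings"]
)

print(json.dumps(config, indent=2))
```

For 256K contexts the same pattern applies with `"factor": 4.0`.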
### Enabling and Disabling Extended Thinking Mode

We enable extended thinking by default, so the example above generates the output with a reasoning trace. To choose between the two modes, you can provide the `/think` and `/no_think` flags through the system prompt, as shown in the snippet below with extended thinking disabled. The code for generating the response with extended thinking is the same, except that the system prompt should contain `/think` instead of `/no_think`.