Update README.md
README.md CHANGED
@@ -26,6 +26,10 @@ language:
 
 
 
+***NOTE***: *This is just a slightly improved version that I trained using `"max_position_embeddings": 65536` + `"rope_scaling": {"factor": 2.0, ...` rather than setting the `rope_scaling` after training...*
+
+---
+
 A `0.6B` parameter draft (speculative decoding) model for use with [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) and [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3).
 
 **NOTES**: