sometimesanotion committed
Commit 0048610 · verified · 1 parent: d7c0e06

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -29,7 +29,7 @@ Lamarck 14B v0.6: A generalist merge focused on multi-step reasoning, prose, an
 
 Previous releases were based on a SLERP merge of model_stock+della branches focused on reasoning and prose. The prose branch got surprisingly good at reasoning, and the reasoning branch became a strong generalist in its own right. Some of you have already downloaded it as [sometimesanotion/Qwen2.5-14B-Vimarckoso-v3](https://huggingface.co/sometimesanotion/Qwen2.5-14B-Vimarckoso-v3).
 
-A notable contribution to the middle to upper layers of Lamarck v0.6 comes from [Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B). It has a fascinating research paper: [DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought](https://huggingface.co/papers/2412.17498). It is only a minor contribution, as I have not resolved IFEval issues with larger merges. Rigorously tested CoT has not yet arrived in Lamarck.
+A notable contribution to the middle to upper layers of Lamarck v0.6 comes from [Krystalan/DRT-o1-14B](https://huggingface.co/Krystalan/DRT-o1-14B). It has a fascinating research paper: [DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought](https://huggingface.co/papers/2412.17498).
 
 Lamarck 0.6 hit a whole new level of toolchain-automated complexity with its multi-pronged merge strategies:
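For context on the two-branch SLERP the README describes, below is a minimal sketch in mergekit's YAML syntax of that kind of merge. The branch model names are hypothetical placeholders rather than the actual Lamarck v0.6 recipe; 48 is the Qwen2.5-14B layer count.

```yaml
# Minimal sketch of a two-branch SLERP merge in mergekit YAML syntax.
# "reasoning-branch" and "prose-branch" are hypothetical placeholders,
# not the actual Lamarck v0.6 recipe.
slices:
  - sources:
      - model: reasoning-branch   # e.g. a model_stock/della reasoning merge
        layer_range: [0, 48]
      - model: prose-branch       # e.g. a prose-focused merge
        layer_range: [0, 48]
merge_method: slerp
base_model: reasoning-branch
parameters:
  t: 0.5        # interpolation factor: 0 keeps the base model, 1 the other
dtype: bfloat16
```

SLERP interpolates along the hypersphere between the two branches' weight tensors instead of averaging them linearly, so the single `t` parameter controls how much of each branch survives in the merged model.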