Was this trained on top of Llama 3.1 70B or Llama 3.3 70B?

#4
by ddh0 - opened

The blog post makes it seem like it was trained on top of Llama 3.3, but the model card shows it as being tuned on top of Llama 3.1:

[Screenshot 2025-07-31 at 4.40.43 PM.png]

Deep Cogito org

It was trained on top of the Llama 3.1 base model. (We used Llama 3.3 in the blog post as the baseline to compare against.)

drishanarora changed discussion status to closed

That's pretty impressive.
