Was this trained on top of Llama 3.1 70B or Llama 3.3 70B?

#4
by ddh0 - opened

The blog post makes it seem like it was trained on top of Llama 3.3, but the model card shows it as being tuned on top of Llama 3.1:

[Screenshot 2025-07-31 at 4.40.43 PM.png]

Deep Cogito org

It was trained on top of the Llama 3.1 base model. (We used Llama 3.3 in the blog post as the baseline to compare against.)

drishanarora changed discussion status to closed

That's pretty impressive.
