Update model types

#14
by CombinHorizon - opened

Would you please update the following? (Not all of them are clear-cut.)

🟢 pretrained

🔶 fine-tuned

  • mistralai/Mixtral-8x22B-Instruct-v0.1 (InstructFT)
  • mistralai/Mixtral-8x7B-Instruct-v0.1 (InstructFT)
  • mistralai/Mistral-7B-Instruct-v0.3 (InstructFT)
  • CohereForAI/aya-23-35B (InstructFT)
  • CohereForAI/aya-23-8B (InstructFT)
  • mii-llm/minerva-chat-v0.1-alpha-sft
  • google/recurrentgemma-2b-it

🟦 RL-tuned (RLHF, DPO, ORPO, ...)

  • Qwen/Qwen2-72B-Instruct
  • meta-llama/Meta-Llama-3-70B-Instruct
  • microsoft/Phi-3-small-8k-instruct
  • microsoft/Phi-3-medium-4k-instruct
  • Qwen/Qwen1.5-110B-Chat
  • CohereForAI/c4ai-command-r-plus
  • microsoft/Phi-3-medium-128k-instruct
  • meta-llama/Llama-2-70b-chat-hf
  • CohereForAI/c4ai-command-r-v01
  • mii-llm/maestrale-chat-v0.4-beta
  • Qwen/Qwen2-7B-Instruct
  • microsoft/Phi-3-mini-4k-instruct
  • meta-llama/Meta-Llama-3-8B-Instruct
  • google/gemma-1.1-7b-it
  • google/gemma-7b-it
  • Qwen/Qwen2-1.5B-Instruct
  • google/gemma-1.1-2b-it
  • google/gemma-2b-it
  • Qwen/Qwen2-0.5B-Instruct

BTW, other leaderboards have changed to a different categorization system:
🟢 pretrained
🟩 continuously pretrained
🔶 fine-tuned on domain-specific datasets
💬 chat models (RLHF, DPO, IFT, ...)
🤝 base merges and moerges

some also add a
🆎 language adapted (FP, FT, ...)

so it's (see the code sketch after this list):
🟢 pretrained → 🟢 or 🟩
🔶 fine-tuned → 🔶
⭕ merged → 🤝
🟦 RL-tuned → 💬
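
For illustration, that remapping could be written down as a small lookup table. This is only a sketch: the emoji labels are copied from the lists above, while the dict and function names are invented here and are not part of any leaderboard code.

```python
# Minimal sketch of the old -> new category remapping described above.
# The emoji labels come from this comment; OLD_TO_NEW and
# candidate_new_categories are illustrative names only.
OLD_TO_NEW = {
    "🟢 pretrained": ["🟢 pretrained", "🟩 continuously pretrained"],
    "🔶 fine-tuned": ["🔶 fine-tuned on domain-specific datasets"],
    "⭕ merged": ["🤝 base merges and moerges"],
    "🟦 RL-tuned": ["💬 chat models (RLHF, DPO, IFT, ...)"],
}

def candidate_new_categories(old_label: str) -> list[str]:
    """Return the possible new labels for an existing leaderboard label."""
    return OLD_TO_NEW.get(old_label, [old_label])

# candidate_new_categories("🟢 pretrained")
# -> ["🟢 pretrained", "🟩 continuously pretrained"]
```

Note that 🟢 is the only old label that maps to more than one new label, which is exactly where the "not clear-cut" cases come from.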

mii-llm org

Hello, and thank you for your thorough comment!
This is feasible; however, I've identified two issues:

  • For some models featured on the leaderboard, I have no way to tell whether they were fine-tuned, RL-tuned, or continuously pre-trained, because the model card doesn't provide that information.
  • Certain models might belong to multiple categories (e.g., merge + fine-tune). In such cases, how should I prioritize one category over another? (One possible heuristic is sketched below.)
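
One hedged possibility for both points, assuming the Hub tags exposed by `huggingface_hub` are usable as a signal: when a model matches several categories, let the most downstream training stage win. The tag names and the precedence order below are illustrative assumptions, not an established convention or the leaderboard's actual logic.

```python
# Rough sketch, not the leaderboard's actual pipeline: use Hub tags to pick
# a single category, resolving overlaps with a fixed precedence order.
from huggingface_hub import HfApi

# Most "downstream" stage wins: a merged-then-DPO'd model is labelled 💬
# rather than 🤝, a continued-pretrain that was later SFT'd becomes 🔶, etc.
# The marker tags here are guesses for illustration.
PRECEDENCE = [
    ("💬 chat models (RLHF, DPO, IFT, ...)", {"rlhf", "dpo", "orpo", "conversational"}),
    ("🤝 base merges and moerges", {"merge", "mergekit"}),
    ("🔶 fine-tuned on domain-specific datasets", {"finetuned", "sft", "instruct"}),
    ("🟩 continuously pretrained", {"continual-pretraining"}),
]

def guess_category(repo_id: str) -> str:
    """Best-effort single label for a model, based only on its Hub tags."""
    tags = {t.lower() for t in (HfApi().model_info(repo_id).tags or [])}
    for label, markers in PRECEDENCE:
        if tags & markers:
            return label
    return "🟢 pretrained"  # nothing matched: fall back to plain pretrained

print(guess_category("Qwen/Qwen2-7B-Instruct"))
```

Whatever markers are chosen, the main design decision is committing to one deterministic precedence so ambiguous models always land in the same bucket; models with no usable metadata would still need a manual call.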
