Ring-mini-2.0

🤗 Hugging Face   |   🤖 ModelScope

Introduction

We present Ring-mini-2.0, a compact yet powerful reasoning model. It has 16B total parameters, of which 1.4B are activated per input token (789M non-embedding). Although Ring-mini-2.0 is quite compact, it still reaches the top tier of sub-10B dense LLMs and even matches or surpasses much larger MoE models, thanks to pre-training on 20T tokens of high-quality data followed by long-CoT supervised fine-tuning and multi-stage reinforcement learning.

Model Downloads

| Model | #Total Params | #Activated Params | Context Length | Download |
|:------|:-------------:|:-----------------:|:--------------:|:--------:|
| Ring-mini-2.0 | 16.8B | 1.4B | 128K | 🤗 HuggingFace |
| Ring-lite-2507 | 16.8B | 2.75B | 128K | 🤗 HuggingFace |
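
If you prefer to fetch the weights ahead of time rather than on first load, the `huggingface_hub` library can mirror the repository to your local cache. A minimal sketch (the printed path depends on your cache configuration):

```python
# Sketch: pre-download the checkpoint with huggingface_hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="inclusionAI/Ring-mini-2.0")
print(f"Model files downloaded to: {local_dir}")
```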

Evaluation

For a comprehensive evaluation of our reasoning models, we ran automatic benchmarks covering math, code, and science. The results indicate that Ring-mini-2.0 achieves performance comparable to Ring-lite-2507 while activating only half as many parameters.

Quickstart

🤗 Hugging Face Transformers

Here is a code snippet showing how to use the chat model with `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-mini-2.0"

# Load the model with automatic dtype selection and device placement.
# trust_remote_code is required because the architecture ships custom code.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
    {"role": "user", "content": prompt}
]
# Render the chat template; enable_thinking=True turns on the long-CoT
# reasoning mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt", return_token_type_ids=False).to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)
# Strip the prompt tokens so only the newly generated completion remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
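
For interactive use, you can stream tokens to the console as they are generated instead of waiting for the full completion. A minimal sketch using the built-in `TextStreamer` from `transformers`, reusing the `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are produced; skip_prompt hides the echoed
# input, and skip_special_tokens drops control tokens from the output.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **model_inputs,
    max_new_tokens=8192,
    streamer=streamer
)
```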

Deployment

Please refer to the GitHub repository.
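
Once the model is served behind an OpenAI-compatible endpoint (for example via an inference server such as vLLM, as covered in the GitHub instructions), it can be queried with the standard `openai` client. A hedged sketch; the `base_url` and served model name are assumptions and depend on how you launch the server:

```python
# Sketch: query an OpenAI-compatible server hosting Ring-mini-2.0.
# base_url and the model name are assumptions; match your server launch.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="inclusionAI/Ring-mini-2.0",
    messages=[
        {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
        {"role": "user", "content": "Give me a short introduction to large language models."}
    ],
    max_tokens=8192,
)
print(completion.choices[0].message.content)
```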

License

This code repository is licensed under the MIT License.

Citation

TODO
