metadata

license: other

Art v0 3B

Art v0 3B is our inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct using a specialized dataset generated with Gemini 2.0 Flash Thinking.

Model Details

Base Model: Qwen2.5-3B-Instruct
Architecture: Transformer
Size: 3B parameters

Usage

The model incorporates a reasoning mechanism using specific tags:

<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response

Recommendations

Use the model without quantization
Use the tokenizer chat template
Use a low temperature 0.1-0.3 and repetition_penalty of 1.1

Training Details

This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.

About Us

We are a community-funded AI research lab focused on advancing open-source AGI development. Art v0 3B represents our first step in the Art series of models. Our community members support us through Patreon donations.

Community Access

Our supporters get exclusive access to:

Training dataset
Training code and methodology
Behind-the-scenes development insights
Future model previews

Join Our Community