metadata
license: other
Art v0 3B
Art v0 3B is our inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct using a specialized dataset generated with Gemini 2.0 Flash Thinking.
Model Details
- Base Model: Qwen2.5-3B-Instruct
- Architecture: Transformer
- Size: 3B parameters
Usage
The model incorporates a reasoning mechanism using specific tags:
<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
Recommendations
- Use the model without quantization
- Use the tokenizer chat template
- Use a low temperature 0.1-0.3 and repetition_penalty of 1.1
Training Details
This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.
About Us
We are a community-funded AI research lab focused on advancing open-source AGI development. Art v0 3B represents our first step in the Art series of models. Our community members support us through Patreon donations.
Community Access
Our supporters get exclusive access to:
- Training dataset
- Training code and methodology
- Behind-the-scenes development insights
- Future model previews