|
--- |
|
license: other |
|
--- |
|
|
|
# Auto Regressive Thinker (Art) v0 3B |
|
|
|
Art v0 3B is our inaugural model in the Art series, fine-tuned from **Qwen/Qwen2.5-3B-Instruct** using a specialized dataset generated with **Gemini 2.0 Flash Thinking**. |
|
[Read more about the Art series](https://blog.agi-0.com/posts/art-series) |
|
|
|
## Model Details |
|
- **Base Model:** Qwen2.5-3B-Instruct |
|
- **Architecture:** Transformer |
|
- **Size:** 3B parameters |
|
|
|
## Usage |
|
|
|
The model incorporates a reasoning mechanism using specific tags: |
|
```python |
|
<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response |
|
``` |
|
|
|
### Recommendations |
|
- Use the model without quantization |
|
- Use the tokenizer chat template |
|
- Use a low temperature 0.1-0.3 and repetition_penalty of 1.1 |
|
|
|
## Training Details |
|
This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members. |
|
|
|
## About Us |
|
We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations. |
|
|
|
## Community Access |
|
Our supporters get exclusive access to: |
|
- Training dataset |
|
- Training code and methodology |
|
- Behind-the-scenes development insights |
|
- Future model previews |
|
|
|
[Join Our Community](https://blog.agi-0.com/posts/join-us) |