AGI-0
/

Art-v0-3B

Model card Files Files and versions Community

Art-v0-3B / README.md

gr0010's picture

Update README.md

7a55f84 verified 4 months ago

|

history blame contribute delete

1.37 kB

	---
	license: other
	---

	# Auto Regressive Thinker (Art) v0 3B

	Art v0 3B is our inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct using a specialized dataset generated with Gemini 2.0 Flash Thinking.
	[Read more about the Art series](https://blog.agi-0.com/posts/art-series)

	## Model Details
	- Base Model: Qwen2.5-3B-Instruct
	- Architecture: Transformer
	- Size: 3B parameters

	## Usage

	The model incorporates a reasoning mechanism using specific tags:
	```python
	<\|start_reasoning\|> model's reasoning process <\|end_reasoning\|> model's response
	```

	### Recommendations
	- Use the model without quantization
	- Use the tokenizer chat template
	- Use a low temperature 0.1-0.3 and repetition_penalty of 1.1

	## Training Details
	This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.

	## About Us
	We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.

	## Community Access
	Our supporters get exclusive access to:
	- Training dataset
	- Training code and methodology
	- Behind-the-scenes development insights
	- Future model previews

	[Join Our Community](https://blog.agi-0.com/posts/join-us)