---
license: apache-2.0
datasets:
- shisa-ai/shisa-v2-sharegpt
- shisa-ai/shisa-v2-405b-ultrafeedback-armorm
language:
- ja
- en
base_model:
- Qwen/Qwen3-8B
---

This is a WIP version of Qwen3 8B post-trained with the full Shisa V2 recipe. It is a *non-reasoning* model: thinking has been disabled in the default `chat_template`.

This model will shortly be replaced by a V2.1 release, but preliminary benchmarks suggest it is already quite strong.

Shaberi (judged by GPT-4.1):

| Model                               | Average  | ELYZA 100 | JA-MT    | Rakuda   | Tengu    |
|-------------------------------------|----------|-----------|----------|----------|----------|
| 017-qwen3-8b-v2-dpo405b-clr-nothink | **7.75** | **7.88**  | **8.08** | **8.08** | **6.94** |
| shisa-ai/shisa-v2-llama3.1-8b       | 7.14     | 7.54      | 6.83     | 7.85     | 6.34     |
| shisa-ai/shisa-v2-qwen2.5-7b        | 7.10     | 7.48      | 7.40     | 7.18     | 6.33     |

And JA MT-Bench (judged by GPT-4.1):

| Model                               | coding  | extraction | humanities | math    | reasoning | roleplay | stem    | writing | Overall  |
|-------------------------------------|---------|------------|------------|---------|-----------|----------|---------|---------|----------|
| 017-qwen3-8b-v2-dpo405b-clr-nothink | **7.3** | **7.55**   | **8.85**   | **9.3** | **6.05**  | **7.9**  | **8.6** | **8.9** | **8.06** |
| shisa-ai/shisa-v2-qwen2.5-7b        | 6.7     | 7.15       | 7.55       | 8.5     | 5.4       | **7.9**  | 7.5     | 7.7     | 7.3      |
| shisa-ai/shisa-v2-llama3.1-8b       | 5.3     | 6.95       | 8.4        | 6.55    | 5.95      | 7.65     | 7.25    | 7.9     | 6.99     |