Sugoi LLM 14B Ultra (HF version)

This is a 4-bit version of Sugoi 14B Ultra, quantized with GPTQModel using the VNTL-v3.1-1k dataset as calibration data. This quant should work better than GGUF for backends such as vLLM and aphrodite-engine, which excel at asynchronous prompting.
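For vLLM specifically, loading the quant might look like the minimal sketch below. Only the repo id is taken from this page; the context length and everything else is illustrative, and a vLLM build with GPTQ support is assumed.

```python
from vllm import LLM

# Load the 4-bit GPTQ checkpoint with vLLM; the quantization method is
# normally picked up automatically from the quantize config in the repo.
llm = LLM(
    model="Delnith/Sugoi-14B-Ultra-HF-gptqmodel-4bit",
    max_model_len=4096,   # adjust to your VRAM budget
)

# Use llm.chat(...) with the system prompt from the section below,
# or serve the model via vLLM's OpenAI-compatible API for async clients.
```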

Building on the previous Sugoi 14B model, Sugoi 14B Ultra nearly doubles translation accuracy compared to its quantized predecessor, achieving a BLEU score of 21.38 versus 13.67. Its prompt-following rivals that of Qwen 2.5 Base, especially when handling the bracket-heavy text commonly found in RPG Maker projects.


Model Overview

  • Key Improvements

    • Nearly 2× BLEU score boost over previous quantized version (21.38 vs 13.67).
    • Stronger prompt adherence, especially with RPGM-style bracketed text.
  • Ideal Use Cases

    • Japanese → English translation—especially for game dialogue or RPG text.
    • Interactive environments—works well with chat UIs like LM Studio.

System Prompt & Settings

For best performance, include the following system prompt:

You are a professional localizer whose primary goal is to translate Japanese to English. You should use colloquial or slang or nsfw vocabulary if it makes the translation more accurate. Always respond in English.

Additional recommendations:

  • Context length: keep to roughly 10 lines (longer context may degrade quality).
  • In LM Studio, you can interactively ask grammar or context questions, or switch target language via the prompt (quality may vary).
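In practice, the system prompt is supplied as the system message of an ordinary chat request. Below is a minimal sketch against an OpenAI-compatible local server; the base_url matches LM Studio's default endpoint and the model name is a placeholder, so adjust both to your setup (vLLM's OpenAI-compatible server works the same way).

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

SYSTEM_PROMPT = (
    "You are a professional localizer whose primary goal is to translate "
    "Japanese to English. You should use colloquial or slang or nsfw "
    "vocabulary if it makes the translation more accurate. "
    "Always respond in English."
)

# Keep the rolling context short (around 10 lines), per the note above.
context_lines = [
    "「おはよう、先輩！」",
    "「今日の依頼、もう確認した？」",
    "「まだだよ。ギルドに寄ってから確認するつもり。」",
]

resp = client.chat.completions.create(
    model="sugoi-14b-ultra",  # placeholder: use the id your server exposes
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "\n".join(context_lines)},
    ],
    temperature=0.1,
)
print(resp.choices[0].message.content)
```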

Experimental Features

These features are experimental and may need tuning:

  1. Tool Integration & JSON Output
  2. RPGM Tag Preservation
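For tag preservation in particular, a simple external check can flag lines where control codes were dropped or altered during translation. This is an illustrative sketch, not part of the model or its tooling:

```python
import re

# RPG Maker-style control codes such as \C[2], \N[1], or \V[3].
TAG_RE = re.compile(r"\\[A-Za-z]+\[\d+\]")

def tags_preserved(source: str, translation: str) -> bool:
    """True if the translation keeps the same control codes, in the same order."""
    return TAG_RE.findall(source) == TAG_RE.findall(translation)

src = "\\C[2]勇者\\C[0]は\\V[1]ゴールドを手に入れた！"
out = "\\C[2]Hero\\C[0] obtained \\V[1] gold!"
print(tags_preserved(src, out))  # True
```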

Recommended Sampling Parameters

  • Temperature: 0.1
  • Top-K: 40
  • Top-P: 0.95
  • Min-P: 0.05
  • Repeat Penalty: 1.1
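If you serve the model with vLLM, these settings map onto SamplingParams roughly as shown below. This is a sketch; min_p and repetition_penalty require a reasonably recent vLLM release, and the output budget is illustrative.

```python
from vllm import SamplingParams

sugoi_params = SamplingParams(
    temperature=0.1,
    top_k=40,
    top_p=0.95,
    min_p=0.05,
    repetition_penalty=1.1,   # vLLM's equivalent of "repeat penalty"
    max_tokens=512,           # illustrative output budget
)
```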

Evaluation & Comparison

  • Quantitative: BLEU score nearly doubled vs. the prior quantized version (21.38 vs 13.67); a sketch of how such a comparison can be computed follows this list.
  • Qualitative: Handles complex prompts and RPG Maker markup well, delivering clean and accurate translations.
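For reference, a corpus-level BLEU comparison like the one above can be computed with sacrebleu. The file names below are hypothetical placeholders; the evaluation set behind the quoted scores is not bundled with this sketch.

```python
import sacrebleu

# Hypothetical files: one candidate translation / reference per line, aligned.
with open("model_outputs.en", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.en", encoding="utf-8") as f:
    references = [line.strip() for line in f]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```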

Limitations & Usage Notes

  • Overly long context may “poison” the output—keep it around 10 lines for best results.
  • Experimental features like JSON formatting and tag preservation may not always work perfectly—review outputs carefully.
  • Performance may vary depending on the prompt complexity and UI/tool environment.
  • Uncensored only for the translation task with the translation system prompt; other use cases such as roleplay or chat may still trigger Qwen's built-in censoring.

Getting the Model

Available via the Files and Versions tab above.
