Sugoi LLM 14B Ultra (HF version)

This is a 4-bit version of Sugoi 14B Ultra, quantized with GPTQModel using the VNTL-v3.1-1k dataset as calibration data. This quant should work better than GGUF for backends such as vLLM and aphrodite-engine, which excel at asynchronous prompting.
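For vLLM specifically, loading the quant might look like the minimal sketch below. Only the repo id is taken from this page; the context length and everything else is illustrative, and a vLLM build with GPTQ support is assumed.

```python
from vllm import LLM

# Load the 4-bit GPTQ checkpoint with vLLM; the quantization method is
# normally picked up automatically from the quantize config in the repo.
llm = LLM(
    model="Delnith/Sugoi-14B-Ultra-HF-gptqmodel-4bit",
    max_model_len=4096,   # adjust to your VRAM budget
)

# Use llm.chat(...) with the system prompt from the section below,
# or serve the model via vLLM's OpenAI-compatible API for async clients.
```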

Building on the previous Sugoi 14B model, Sugoi 14B Ultra nearly doubles translation accuracy compared to its quantized predecessor, achieving a BLEU score of 21.38 versus 13.67. Its prompt-following rivals that of Qwen 2.5 Base, especially when handling the bracket-heavy text commonly found in RPG Maker projects.


Model Overview

  • Key Improvements

    • Nearly 2× BLEU score boost over previous quantized version (21.38 vs 13.67).
    • Stronger prompt adherence, especially with RPGM-style bracketed text.
  • Ideal Use Cases

    • Japanese → English translation—especially for game dialogue or RPG text.
    • Interactive environments—works well with chat UIs like LM Studio.

System Prompt & Settings

For best performance, include the following system prompt:

You are a professional localizer whose primary goal is to translate Japanese to English. You should use colloquial or slang or nsfw vocabulary if it makes the translation more accurate. Always respond in English.

Additional recommendations:

  • Context length: keep to roughly 10 lines (longer context may degrade quality).
  • In LM Studio, you can interactively ask grammar or context questions, or switch target language via the prompt (quality may vary).
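In practice, the system prompt is supplied as the system message of an ordinary chat request. Below is a minimal sketch against an OpenAI-compatible local server; the base_url matches LM Studio's default endpoint and the model name is a placeholder, so adjust both to your setup (vLLM's OpenAI-compatible server works the same way).

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

SYSTEM_PROMPT = (
    "You are a professional localizer whose primary goal is to translate "
    "Japanese to English. You should use colloquial or slang or nsfw "
    "vocabulary if it makes the translation more accurate. "
    "Always respond in English."
)

# Keep the rolling context short (around 10 lines), per the note above.
context_lines = [
    "「おはよう、先輩！」",
    "「今日の依頼、もう確認した？」",
    "「まだだよ。ギルドに寄ってから確認するつもり。」",
]

resp = client.chat.completions.create(
    model="sugoi-14b-ultra",  # placeholder: use the id your server exposes
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "\n".join(context_lines)},
    ],
    temperature=0.1,
)
print(resp.choices[0].message.content)
```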

Experimental Features

These features are experimental and may need tuning:

  1. Tool Integration & JSON Output
  2. RPGM Tag Preservation
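For tag preservation in particular, a simple external check can flag lines where control codes were dropped or altered during translation. This is an illustrative sketch, not part of the model or its tooling:

```python
import re

# RPG Maker-style control codes such as \C[2], \N[1], or \V[3].
TAG_RE = re.compile(r"\\[A-Za-z]+\[\d+\]")

def tags_preserved(source: str, translation: str) -> bool:
    """True if the translation keeps the same control codes, in the same order."""
    return TAG_RE.findall(source) == TAG_RE.findall(translation)

src = "\\C[2]勇者\\C[0]は\\V[1]ゴールドを手に入れた！"
out = "\\C[2]Hero\\C[0] obtained \\V[1] gold!"
print(tags_preserved(src, out))  # True
```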

Recommended Sampling Parameters

  • Temperature: 0.1
  • Top-K: 40
  • Top-P: 0.95
  • Min-P: 0.05
  • Repeat Penalty: 1.1
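If you serve the model with vLLM, these settings map onto SamplingParams roughly as shown below. This is a sketch; min_p and repetition_penalty require a reasonably recent vLLM release, and the output budget is illustrative.

```python
from vllm import SamplingParams

sugoi_params = SamplingParams(
    temperature=0.1,
    top_k=40,
    top_p=0.95,
    min_p=0.05,
    repetition_penalty=1.1,   # vLLM's equivalent of "repeat penalty"
    max_tokens=512,           # illustrative output budget
)
```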

Evaluation & Comparison

  • Quantitative: BLEU score nearly doubled vs. the prior quantized version (21.38 vs 13.67); a sketch of how such a comparison can be computed follows this list.
  • Qualitative: Handles complex prompts and RPG Maker markup well, delivering clean and accurate translations.
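For reference, a corpus-level BLEU comparison like the one above can be computed with sacrebleu. The file names below are hypothetical placeholders; the evaluation set behind the quoted scores is not bundled with this sketch.

```python
import sacrebleu

# Hypothetical files: one candidate translation / reference per line, aligned.
with open("model_outputs.en", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("references.en", encoding="utf-8") as f:
    references = [line.strip() for line in f]

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```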

Limitations & Usage Notes

  • Overly long context may “poison” the output—keep it around 10 lines for best results.
  • Experimental features like JSON formatting and tag preservation may not always work perfectly—review outputs carefully.
  • Performance may vary depending on the prompt complexity and UI/tool environment.
  • Uncensored only for the translation task with the translation system prompt; other use cases such as roleplay or chat may still trigger Qwen's built-in censoring.

Getting the Model

Available via the Files and Versions tab above.
