VNTL Settings Guide for ST

#4
by Casual-Autopsy - opened

Alright, I think I've mess around with this model enough to create a guide, so here we go...

SillyTavern

Prerequisite

  • SillyTavern
  • Extensions
    • Required:
      • LALib https://github.com/LenAnderson/SillyTavern-LALib
      • Message Actions https://github.com/LenAnderson/SillyTavern-MessageActions
      • Send Button https://github.com/LenAnderson/SillyTavern-SendButton
    • Recommended:
      • Input History https://github.com/LenAnderson/SillyTavern-InputHistory
      • Keyboard https://github.com/LenAnderson/SillyTavern-Keyboard
      • Notebook https://github.com/SillyTavern/Extension-Notebook
      • Chat Top Bar https://github.com/SillyTavern/Extension-TopInfoBar
      • Backup Browser https://github.com/LenAnderson/SillyTavern-BackupsBrowser
      • Message Limit https://github.com/SillyTavern/Extension-MessageLimit
  • My ST Master Preset
  • AutoHotkey
  • Luna Translator
  • an LLM Backend(KoboldCpp recommended)
  • PC specs that allow you to run VNTL at atleast 10 t/s(5 t/s if you're desperate enough)

Backend Settings

Due to overfitting with 1k token only dataset, translation quality diminishes past the 1k point, but with the help of RoPE, this isn't too much of an issue, so if you use SillyTavern instead of Luna Translator as your LLM frontend, then use these backend configs to fix the quality drop.

Context: 4096

Custom RoPE:
  - RoPE Base: 6315084.4

TextGen

While the model card says to use neutral samplers with 0 temp, I've found these settings to work rather well, increasing writing quality, and even translation accuracy. Keep in mind the golden garbage in, garbage out rule and fix any mistakes you find when using these settings.

Screenshot 2025-01-28 171041.png

Screenshot 2025-01-28 171019.png

Screenshot 2025-01-28 171057.png

Screenshot 2025-01-28 171128.png

QR's

Eventually you'll end up with some translation weirdness that's hard to figure out and fix, in cases like that you can start the context anew by hiding all messages from the prompt. Here's the QR to do so.

/hide 0-{{lastMessageId}}

There will also be cases where you notice lag. SillyTavern isn't made to handle large amounts of messages loaded at once, so make sure to enable a message load limit when opening a chat in User Settings > Chat/Message Handling > # Msg. to Load

Once done that, create a QR to reload the chat whenever things start to get a bit laggy:

/chat-reload

QoL

Document Mode: setting chat style tp document mode allows to quickly enter message edit mode with any chat message by simply double clicking.
Screenshot 2025-01-28 170422.png

Auto-save message edits: Enabling auto-saving for message edits allows to quickly leave message edit mode with Esc and pairs well with document mode.
Screenshot 2025-01-28 170625.png

Setting Up RAG/Vector Storage for Better Translation Quality/Consistency (WIP)

1. Send Message Extension

Open the Send Message extension menu and create a send button with this script:

/send name="Japanese" {{input}} |
/setinput "{{noop}}" |

/data-bank-ingest |

/gen lock=on name="English" as=char "{{noop}}" |
/sendas name="English" {{pipe}} |

Disable clear input Because there's a glitch where input is cleared before it can be sent to the STscript (You'll have to redisable it every time you reopen ST)
To set as your default send, right click the send button on the chat bar and click the title of the custom send in the pop-up menu to set it as default.

This script has to be used since there's no easy way to get the role of a message through ST commands.

2. Vector Storage

Use the following messages to set up Vector Storage:

Screenshot 2025-02-02 181713.png

Screenshot 2025-02-02 181729.png

Screenshot 2025-02-02 181847.png

3. Main STscript

Create a new QRset called MTL MesAct and create a QR called Chat-Pair RAG. Paste the following script:

/if left="{{mes::name}}" right="Japanese" rule=neq {:
    /abort quiet=false QR must be used on a user role message |
:}|

/add {{mes::id}} 1 |
/let n {{pipe}} |
/messages names=on {{mes::id}}-{{var::n}} |

/let mesGrab {{pipe}} |

/re-replace find="/Japanese: /" replace="<\|start_header_id\|>Japanese<\|end_header_id\|>{{newline}}{{newline}}" {{var::mesGrab}} |
/re-replace find="/\n\nEnglish: /" replace="<\|eot_id\|><\|start_header_id\|>English<\|end_header_id\|>{{newline}}{{newline}}" {{pipe}} |

/let mesInst "{{pipe}}<\|eot_id\|>" |

/let entName "" |
/input rows=1 Write data-back entry name(Recommended to use the name of the character speaking) |

/if left="{{pipe}}" right="" rule=eq else={:
    /var key=entName as=string "{{pipe}}" |
:}
{:
    /var key=entName as=string "Translation-Snip" |
:}|


/databank-add name="{{var::entName}}_{{mes::id}}-{{var::n}}" "{{var::mesInst}}" |

Set to the following Icon:

Screenshot 2025-02-02 183446.png

Once done, send the cmd /messageactions "MTL MesAct" | to add it as a button to messages in the expandable menu.

4. Usage

The best way to use RAG to increase quality is to turn every chat pair where you've fixed the translation, and only your fixed translations.

Make sure to only add fixes you are absolutely positive is correct, else RAG will have the opposite effect on translation quality

I'll update this as I figure more stuff out.

Oh! Also to use VNTL with ST, you'll need AutoHotkey. While Luna Translator does have a websocket, currently no server plugin exists to grab the text it retrieves.

Hotkeys:
Ctrl + Alt + T: Target window for switching back to after sending message.
Ctrl + Alt + E: Enable/Disable Auto-pasting to SillyTavern
Here's what I use:

#SingleInstance Force
#Requires AutoHotKey v2.0+

previousClipboard := A_Clipboard
targetWindow := ""
grabToggle := true

CheckClipboard() {
    global previousClipboard
    currentClipboard := A_Clipboard

    if (currentClipboard != previousClipboard) {
        previousClipboard := currentClipboard
        HandleClipboardChange(currentClipboard)
    }
}

HandleClipboardChange(currentClipboard) {
    win := WinExist("SillyTavern")
    if (win AND grabToggle) {
        WinActivate(win)

        WinWaitActive(win)

        Send(currentClipboard)

        Sleep(500)

        Send("{Enter}")

        Sleep(500)

        WindowSwitchBack()
    }
}

WindowSwitchBack() {
    global targetWindow
    win := WinExist(targetWindow)
    if (win) {
        WinActivate(win)

        WinWaitActive(win)
    }
}

SetTimer(CheckClipboard, 500)

^!T:: {
    global targetWindow
    targetWindow := WinGetTitle("A")
    MsgBox("Target window set to: " targetWindow,,"T2")
}

^!E:: {
    global grabToggle
    if (grabToggle) {
        grabToggle := false
        MsgBox("Auto-paste: Disabled",,"T2")
    } else {
        grabToggle := true
        MsgBox("Auto-paste: Enabled",,"T2")
    }
}

Update 1:

  • Added QoL section and updated sampler settings in TextGen

Update 2:

  • Added RAG setup guide
  • Added Prerequisite section
  • Updated AutoHotkey script

Sign up or log in to comment