Yhyu13/oasst-rlhf-2-llama-30b-7k-steps-gptq-4bit

GPTQ 4-bit version, quantized without act-order for compatibility, that works in text-generation-webui

Generated by using scripts from https://gitee.com/yhyu13/llama_-tools

Merged weights: https://huggingface.co/Yhyu13/oasst-rlhf-2-llama-30b-7k-steps-hf

Converted LLaMA weights: https://huggingface.co/Yhyu13/llama-30B-hf-openassitant

Delta weights: https://huggingface.co/OpenAssistant/oasst-rlhf-2-llama-30b-7k-steps-xor
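The delta weights above are published in XOR form, so they only become usable once combined with the converted LLaMA weights. Below is a minimal sketch of that idea, assuming a plain byte-wise XOR between two files of equal length. The function name `xor_bytes` and the toy byte strings are hypothetical, and this is not the actual conversion script from the linked repositories.

```python
def xor_bytes(payload: bytes, base: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(payload) != len(base):
        raise ValueError("payload and base must have the same length")
    return bytes(p ^ b for p, b in zip(payload, base))


# Toy stand-ins for weight files (hypothetical data, not real checkpoints).
base_weights = bytes([0x10, 0x20, 0x30, 0x40])
merged_weights = bytes([0x11, 0x22, 0x33, 0x44])

# Publishing side: the delta is merged XOR base.
delta = xor_bytes(merged_weights, base_weights)

# User side: XOR the delta with the same base to get the merged weights back.
recovered = xor_bytes(delta, base_weights)
assert recovered == merged_weights
```

Because XOR is its own inverse, the same function serves for both encoding and decoding. The delta is useless without the exact base bytes, which is why the converted LLaMA weights have to match what the delta was built against.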

OA has done a great job applying RLHF to their pre-trained weights. I must say it is tuned to produce step-by-step CoT reasoning without you actively prompting it to do so, which is a behavior we also observe in ChatGPT and GPT-4.

But note that it still fails at logical paradox tasks such as era of time and bird shot. Then again, none of the LLaMA-based models, or any available models other than GPT-4 and Claude+, can correctly answer paradox questions anyway. So OA RLHF is expected to fail at these tasks, but I do like the RLHF-ed tone, which makes OA's responses sound professional and proficient.