Anyways, orthogonalization and ablation both refer to the same thing here: the technique by which the refusal feature was "ablated" from the model was orthogonalization.
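
For intuition, here's a minimal sketch of the weight-orthogonalization step (illustrative PyTorch, not the exact code used for this model; `refusal_dir` stands in for the refusal direction extracted from the model's activations):

```python
import torch

def orthogonalize(W: torch.Tensor, refusal_dir: torch.Tensor) -> torch.Tensor:
    """Project a feature direction out of a weight matrix that writes
    into the residual stream.

    W           : (d_model, d_in) weight matrix
    refusal_dir : (d_model,) direction of the feature being ablated
    """
    r = refusal_dir / refusal_dir.norm()  # unit vector along the feature
    # W' = W - r (r^T W), so r^T (W' x) = 0 for every input x:
    # the edited weights can no longer write along the refusal direction.
    return W - torch.outer(r, r @ W)
```

Applied to every matrix that writes into the residual stream (roughly: the embeddings, attention output projections, and MLP down-projections), this bakes the ablation into the weights themselves, so no inference-time intervention is needed.
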
## Why uncensor a code model?

Honestly, this model seems pretty solid even outside of code, and it's a perfect size for 24GB of VRAM once quantized.

By ablating refusals, the model is overall more compliant with the user's requests, regardless of their ethicality. It's worth remembering that sometimes even "good-aligned" requests can be refused and have to be prompt-engineered around.
## A little more on the methodology, and why this is interesting

To me, ablation (or applying the methodology in reverse for "augmentation") seems good for inducing or removing very specific features that you'd otherwise have to spend way too many tokens encouraging or discouraging in your system prompt.
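
As a hypothetical sketch (names illustrative, not from this repo), the same projection trick works at inference time, where a single coefficient turns ablation into augmentation:

```python
import torch

def scale_feature(acts: torch.Tensor, direction: torch.Tensor,
                  coeff: float) -> torch.Tensor:
    """Rescale the component of residual-stream activations lying
    along a feature direction.

    acts      : (..., d_model) activations at some layer
    direction : (d_model,) feature direction
    coeff     : 0.0 ablates the feature, 1.0 is a no-op, >1.0 augments it
    """
    d = direction / direction.norm()
    proj = (acts @ d).unsqueeze(-1) * d  # component of acts along d
    return acts + (coeff - 1.0) * proj   # rescale only that component
```

A hook that applies this at each layer with coeff = 0.0 reproduces ablation; pushing coeff above 1.0 steers the model toward the feature instead of away from it.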