The Jailbreak Tax (Jailbreak Utility)

Updated Apr 5

Models and dataset used in the paper "The Jailbreak Tax: How Useful Are Your Jailbreak Outputs?"

  • ethz-spylab/Llama-3.1-70B-Instruct_refuse_math

    Text Generation • Updated Apr 16

  • ethz-spylab/Llama-3.1-70B-Instruct_refuse_biology

    Text Generation • Updated Apr 16

  • ethz-spylab/Llama-3.1-70B-Instruct_do_math_again

    Updated Feb 18

  • ethz-spylab/Llama-3.1-8B-Instruct_do_bio_again

    Updated Mar 7

  • ethz-spylab/Llama-3.1-8B-Instruct_refuse_bio

    Updated Apr 4

  • ethz-spylab/Llama-3.1-8B-Instruct_refuse_math

    Updated Apr 4

  • ethz-spylab/Llama-3.1-70B-Instruct_do_math_chat

    Updated Feb 21

  • ethz-spylab/Llama-3.1-8B-Instruct_do_math_chat

    Updated Feb 17

  • ethz-spylab/Llama-3.1-8B-Instruct_do_math_again

    Updated Feb 17

  • ethz-spylab/EvilMath

    Viewer • Updated Apr 16 • 487 • 63

  • ethz-spylab/Llama-3.1-70B-Instruct_do_biology_5e-5

    Updated Mar 6

  • ethz-spylab/Llama-3.1-70B-Instruct_do_biology_again_5e-5

    Updated Mar 6

  • ethz-spylab/Llama-3.1-8B-Instruct_do_bio

    Updated Mar 28