h4c5's Collections

moderation-prompts

updated Apr 18

  • mmathys/openai-moderation-api-evaluation

    Viewer • Updated Aug 28, 2023 • 1.68k • 409 • 32

  • Anthropic/hh-rlhf

    Viewer • Updated May 26, 2023 • 169k • 12k • 1.34k

  • WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

    Paper • 2406.18495 • Published Jun 26, 2024 • 13

  • ShieldGemma: Generative AI Content Moderation Based on Gemma

    Paper • 2407.21772 • Published Jul 31, 2024 • 14

  • lmsys/lmsys-chat-1m

    Viewer • Updated Jul 27, 2024 • 1M • 4.53k • 676

  • PKU-Alignment/BeaverTails

    Viewer • Updated Oct 17, 2023 • 364k • 4.17k • 56

  • AgentPublic/camembert-base-toxic-fr-user-prompts

    Text Classification • Updated May 30, 2024 • 699 • 7

  • OpenSafetyLab/Salad-Data

    Viewer • Updated Mar 29, 2024 • 30.4k • 802 • 21

  • meta-llama/Llama-Guard-3-8B

    Text Generation • Updated Oct 11, 2024 • 332k • 197

  • davanstrien/aart-ai-safety-dataset

    Viewer • Updated Jan 9, 2024 • 3.27k • 27 • 2

  • walledai/AdvBench

    Viewer • Updated Jul 4, 2024 • 520 • 6.87k • 29