mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated Text Generation • 8B • Updated Sep 14, 2024 • 5.46k • 173
To Trust or Not To Trust Prediction Scores for Membership Inference Attacks Paper • 2111.09076 • Published Nov 17, 2021 • 1
Plug & Play Attacks: Towards Robust and Flexible Model Inversion Attacks Paper • 2201.12179 • Published Jan 28, 2022 • 1
Balancing Transparency and Risk: The Security and Privacy Risks of Open-Source Machine Learning Models Paper • 2308.09490 • Published Aug 18, 2023 • 1
Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data Paper • 2310.06372 • Published Oct 10, 2023 • 1
Be Careful What You Smooth For: Label Smoothing Can Be a Privacy Shield but Also a Catalyst for Model Inversion Attacks Paper • 2310.06549 • Published Oct 10, 2023 • 1
Sparsely-gated Mixture-of-Expert Layers for CNN Interpretability Paper • 2204.10598 • Published Apr 22, 2022 • 2
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Paper • 2209.08891 • Published Sep 19, 2022 • 2
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations Paper • 2303.09289 • Published Mar 16, 2023 • 2
SEGA: Instructing Diffusion using Semantic Dimensions Paper • 2301.12247 • Published Jan 28, 2023 • 6
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness Paper • 2302.10893 • Published Feb 7, 2023 • 6
Rickrolling the Artist: Injecting Backdoors into Text Encoders for Text-to-Image Synthesis Paper • 2211.02408 • Published Nov 4, 2022