chuanguo/TamedLlama-8B-Instruct

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

TamedLlama-8B-Instruct

Repository for TamedLlama-8B-Instruct, a fine-tuned variant of Llama-3-8B-Instruct that is robust against prompt injection attacks. See our TamedLlama paper for more information.
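
A minimal loading sketch, under stated assumptions: the page's model tree lists this repository as an adapter on `meta-llama/Meta-Llama-3-8B-Instruct`, so the code assumes the weights are a PEFT-format adapter that attaches to that base model. It also assumes `transformers`, `peft`, `torch`, and `accelerate` are installed, and that you have accepted the access conditions and authenticated (for example with `huggingface-cli login`). The function name `load_tamedllama` is hypothetical. How prompts should be formatted to separate trusted instructions from untrusted data is not specified on this page, so the sketch stops at loading.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "chuanguo/TamedLlama-8B-Instruct"


def load_tamedllama():
    """Load the base model and attach the adapter.

    Assumption: the repository holds a PEFT-format adapter. If it turns out
    to hold full model weights, load ADAPTER_ID directly with
    AutoModelForCausalLM instead.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory relative to float32
        device_map="auto",           # requires the accelerate package
    )
    # Attach the adapter weights on top of the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_tamedllama()` downloads both repositories on first use; both are access-controlled, so the call fails without valid credentials.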

Utility Evaluation (higher is better)

Category	Benchmark	Metric	Llama 3 8B Instruct	TamedLlama 8B Instruct	GPT-4o-mini	GPT-4o (2024-11-20)
General Knowledge	MMLU (0-shot, CoT)	macro_avg/acc	64.6	61.2	82.0^[1]	85.7^[2]
	MMLU Pro (5-shot, CoT)	macro_avg/acc	42.5	40.7	63.1^[3]	77.9^[3]
	IFEval		76.3	74.1	-	-
	BBH (3-shot, CoT)	acc	68.4	64.6	-	-
	GPQA (0-shot, CoT)	acc	35.3	32.6	40.2^[1]	46.0^[2]
Instruction Following	AlpacaEval2	win_rate	28.0	26.5	44.7	56.2
	SEP	win_rate	50.0	48.5	65.9	64.9
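
To read the utility cost of the fine-tuning at a glance, the sketch below computes the per-benchmark drop in points from Llama 3 8B Instruct to TamedLlama 8B Instruct. The scores are copied from the table above; the names `UTILITY` and `utility_drops` are illustrative only.

```python
# (Llama 3 8B Instruct, TamedLlama 8B Instruct) scores from the utility table.
UTILITY = {
    "MMLU": (64.6, 61.2),
    "MMLU Pro": (42.5, 40.7),
    "IFEval": (76.3, 74.1),
    "BBH": (68.4, 64.6),
    "GPQA": (35.3, 32.6),
    "AlpacaEval2": (28.0, 26.5),
    "SEP": (50.0, 48.5),
}


def utility_drops(scores):
    """Return the absolute drop in points (base minus fine-tuned) per benchmark."""
    return {name: round(base - tamed, 1) for name, (base, tamed) in scores.items()}


drops = utility_drops(UTILITY)
mean_drop = sum(drops.values()) / len(drops)

for name, drop in drops.items():
    print(f"{name:12s} -{drop:.1f}")
print(f"{'mean':12s} -{mean_drop:.1f}")
```

On these numbers the drops range from 1.5 points (AlpacaEval2, SEP) to 3.8 points (BBH), with a mean of about 2.4 points. Note that the benchmarks use different metrics, so the mean is only a rough summary.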

Security Evaluation (lower is better)

Category	Benchmark	Metric	Llama 3 8B Instruct	TamedLlama 8B Instruct	GPT-4o-mini	GPT-4o (2024-11-20)
Instruction Following	AlpacaFarm	ASR	23.1	0.0	0.5	0.0
	SEP (start)	ASR	48.7	5.9	14.6	14.8
	SEP (end)	ASR	60.0	6.8	9.1	14.4
	TaskTracker	ASR	5.3	0.2	0.3	0.6
	CyberSecEval2	ASR	43.6	7.3	25.5	20.0
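
The sketch below assumes ASR denotes attack success rate, the usual meaning in prompt injection evaluations: the percentage of injection attempts whose injected instruction the model ends up following. How each benchmark judges success is not specified here, so `attack_success_rate` takes an already-judged list of outcomes; that list and all function names are hypothetical. The second half computes the relative ASR reduction from Llama 3 8B Instruct to TamedLlama 8B Instruct using the table values.

```python
def attack_success_rate(outcomes):
    """Percentage of attack attempts that succeeded (True means success)."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one attack outcome")
    return 100.0 * sum(outcomes) / len(outcomes)


def relative_reduction(base_asr, new_asr):
    """Percentage by which new_asr is lower than base_asr."""
    return 100.0 * (base_asr - new_asr) / base_asr


# (Llama 3 8B Instruct, TamedLlama 8B Instruct) ASR values from the security table.
ASR = {
    "AlpacaFarm": (23.1, 0.0),
    "SEP (start)": (48.7, 5.9),
    "SEP (end)": (60.0, 6.8),
    "TaskTracker": (5.3, 0.2),
    "CyberSecEval2": (43.6, 7.3),
}

reductions = {name: relative_reduction(b, t) for name, (b, t) in ASR.items()}

# Hypothetical judged outcomes: one success in four attempts.
example_asr = attack_success_rate([True, False, False, False])

for name, pct in reductions.items():
    print(f"{name:14s} {pct:5.1f}% lower ASR")
```

With the table values, the relative reduction runs from roughly 83% (CyberSecEval2) to 100% (AlpacaFarm).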

Model tree for chuanguo/TamedLlama-8B-Instruct

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
