Teen-Different committed on
Commit
0f1060e
·
verified ·
1 Parent(s): e7e0bb1

Update README.md

  1. README.md +15 -0
README.md CHANGED
@@ -24,10 +24,25 @@ metrics:
24
 
25
  # TD-HallOumi-3B: Llama 3.2 3B for Hallucination Detection / Claim Verification
26
 
27
+
28
  This model is a fine-tuned version of `meta-llama/Llama-3.2-3B-Instruct` specifically adapted for **Claim Verification** and **Hallucination Detection**. It assesses whether claims made in a response are supported by a given context document.
29
 
30
  This work is inspired by and utilizes datasets developed for the [HallOumi project by Oumi AI](https://oumi.ai/blog/posts/introducing-halloumi), which aims to build trust in AI systems by enabling verifiable outputs. This 3B parameter model is provided by the **TEEN-DIFFERENT** community.
31
 
32
+ ## Performance
33
+
34
+ Evaluated on the [oumi-ai/oumi-groundedness-benchmark](https://huggingface.co/datasets/oumi-ai/oumi-groundedness-benchmark) for Hallucination Detection (Macro F1 Score):
35
+
36
+
37
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f8f07735a0a9fc54553c67/PkpHwSBoNxNFgE5KJr08y.png)
38
+
39
+ * **TD-HallOumi-3B\*** achieves **68.00%** Macro F1.
40
+ * **Highly Efficient:** On this benchmark, this 3B parameter model outperforms larger models such as OpenAI o1, Llama 3.1 405B, and Gemini 1.5 Pro.
41
+ * **Competitive:** Ranks closely behind Claude Sonnet 3.5 (69.60%).
42
+
43
+ This model offers strong hallucination detection capabilities with significantly fewer parameters than many alternatives.
44
+
45
+
46
  ## Model Details
47
 
48
  * **Base Model:** [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)