TEEN-D/TD-HallOumi-3B

Text Classification

hallucination-detection

claim-verification


Teen-Different committed on Apr 4

Commit 0f1060e · verified · 1 Parent(s): e7e0bb1

Update README.md

Files changed (1)

README.md +15 -0

@@ -24,10 +24,25 @@ metrics:
 # TD-HallOumi-3B: Llama 3.2 3B for Hallucination Detection / Claim Verification
 This model is a fine-tuned version of `meta-llama/Llama-3.2-3B-Instruct` specifically adapted for **Claim Verification** and **Hallucination Detection**. It assesses whether claims made in a response are supported by a given context document.
 This work is inspired by and utilizes datasets developed for the [HallOumi project by Oumi AI](https://oumi.ai/blog/posts/introducing-halloumi), which aims to build trust in AI systems by enabling verifiable outputs. This 3B parameter model is provided by the **TEEN-DIFFERENT** community.
+## Performance
+Evaluated on the [oumi-ai/oumi-groundedness-benchmark](https://huggingface.co/datasets/oumi-ai/oumi-groundedness-benchmark) for Hallucination Detection (Macro F1 Score):
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f8f07735a0a9fc54553c67/PkpHwSBoNxNFgE5KJr08y.png)
+*   **TD-HallOumi-3B\*** achieves **68.00%** Macro F1.
+*   **Highly Efficient:** On this benchmark, this 3B parameter model outperforms larger models such as OpenAI o1, Llama 3.1 405B, and Gemini 1.5 Pro.
+*   **Competitive:** Ranks closely behind Claude Sonnet 3.5 (69.60%).
+This model offers strong hallucination detection capabilities with significantly fewer parameters than many alternatives.
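The scores above are Macro F1, which is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how often it occurs. The following is a minimal sketch of that computation in plain Python. The label names `supported` and `unsupported` and the sample predictions are hypothetical illustrations, not the benchmark's actual label scheme or evaluation script.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for five claims (assumed names, for illustration only).
y_true = ["supported", "supported", "unsupported", "unsupported", "supported"]
y_pred = ["supported", "unsupported", "unsupported", "supported", "supported"]

score = macro_f1(y_true, y_pred)
print(f"Macro F1: {score:.4f}")
```

Because the average is unweighted, a model that does well only on the majority class is penalized more heavily than it would be under accuracy or micro-averaged F1.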
 ## Model Details
 *   **Base Model:** [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)