Text Classification
Transformers
Safetensors
English
llama
text-generation-inference
wizeng23 committed
Commit bd123b8 · verified · 1 Parent(s): 3414f7e

Update README.md

Files changed (1)
  1. README.md +13 -12
README.md CHANGED
@@ -22,17 +22,17 @@ base_model:
  
  <!-- Provide a quick summary of what the model is/does. -->
  
- Introducing **HallOumi-8B-classifier**, a _fast_ **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only 8 billion parameters!
+ Introducing **HallOumi-8B-classifier**, a _fast_ **SOTA hallucination detection model**, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Claude Sonnet 3.5 at only **8 billion parameters!**
  
- <!-- Give HallOumi a try now! -->
+ Give HallOumi a try now!
  
- <!-- * Demo: https://oumi.ai/halloumi-demo -->
- <!-- * Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi -->
+ * Demo: https://oumi.ai/halloumi-demo
+ * Github: https://github.com/oumi-ai/oumi/tree/main/configs/projects/halloumi
  
  | Model                | Macro F1 Score   | Open?             | Model Size |
  | -------------------- | ---------------- | ----------------- | ---------- |
  | **HallOumi-8B**      | **77.2% ± 2.2%** | Truly Open Source | 8B         |
- | Anthropic Sonnet 3.5 | 69.6% ± 2.8%     | Closed            | ??         |
+ | Claude Sonnet 3.5    | 69.6% ± 2.8%     | Closed            | ??         |
  | OpenAI o1-preview    | 65.9% ± 2.3%     | Closed            | ??         |
  | DeepSeek R1          | 61.6% ± 2.5%     | Open Weights      | 671B       |
  | Llama 3.1 405B       | 58.8% ± 2.4%     | Open Weights      | 405B       |
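
The leaderboard above scores models by macro F1: the unweighted mean of the per-class F1 scores, so the rarer of the supported/hallucinated classes counts as much as the common one. A minimal sketch of the computation, using scikit-learn with made-up labels (both the labels and the 1 = hallucinated encoding are assumptions for illustration):

```python
# Minimal macro-F1 sketch with made-up labels; 1 = hallucinated and
# 0 = supported is an assumed encoding, not the model's official one.
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical gold labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical classifier outputs

# average="macro" takes the unweighted mean of per-class F1 scores,
# which keeps an imbalanced class from dominating the metric.
print(f"Macro F1: {f1_score(y_true, y_pred, average='macro'):.3f}")
```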
@@ -55,10 +55,10 @@ While such tools are useful in the right hands, being unable to trust them preve
  where it can be utilized safely and responsibly.
  
  ## Building Trust with Verifiability
- To begin trusting AI systems, we have to be able to verify their outputs. To verify, we specifically mean that we need to:
+ To begin trusting AI systems, we have to be able to verify their outputs. By verify, we specifically mean that we need to:
  
  * Understand the **truthfulness** of a particular statement produced by any model (the key focus of the **HallOumi-8B-classifier** model).
- * Understand what **information supports that statement’s truth** and have **full traceability** connecting the statement to that information. (Novel aspects treated by our *generative* [HallOumi model](https://huggingface.co/oumi-ai/HallOumi-8B))
+ * Understand what **information supports that statement’s truth** and have **full traceability** connecting the statement to that information (provided by our *generative* [HallOumi model](https://huggingface.co/oumi-ai/HallOumi-8B)).
  
  
  - **Developed by:** [Oumi AI](https://oumi.ai/)
@@ -66,7 +66,7 @@ To begin trusting AI systems, we have to be able to verify their outputs. To ver
  - **Language(s) (NLP):** English
  - **License:** [CC-BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en)
  - **Finetuned from model:** [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
- <!-- - **Demo:** [HallOumi Demo](https://oumi.ai/halloumi) -->
+ - **Demo:** [HallOumi Demo](https://oumi.ai/halloumi)
  
  ---
  
@@ -75,12 +75,12 @@ To begin trusting AI systems, we have to be able to verify their outputs. To ver
  <!-- Address questions around how the model is intended to be used, including the foreseeable users and those affected by the model. -->
  Use to verify claims/detect hallucinations in scenarios where a known source of truth is available.
  
- <!-- Demo: https://oumi.ai/halloumi -->
+ Demo: https://oumi.ai/halloumi-demo
  
  ## Out-of-Scope Use
  
  <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
- Smaller LLMs have limited capabilities and should be used with caution. Please don't use this model for purposes outside of claim verification.
+ Smaller LLMs have limited capabilities and should be used with caution. Avoid using this model for purposes outside of claim verification.
  
  ## Bias, Risks, and Limitations
  
@@ -100,12 +100,13 @@ Training data:
  ### Training Procedure
  
  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when relevant to the training procedure. -->
- Training notebook: Coming Soon
+ For information on training, see https://oumi.ai/halloumi
  
  ## Evaluation
  
  <!-- This section describes the evaluation protocols and provides the results. -->
- Eval notebook: Coming Soon
+ Follow along with our notebook on how to evaluate hallucination with HallOumi and other popular models:
+ https://github.com/oumi-ai/oumi/blob/main/configs/projects/halloumi/halloumi_eval_notebook.ipynb
  
  ## Environmental Impact
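
For the intended use described in the diff (verifying a claim against a known source of truth), loading the classifier through Transformers should look roughly like the sketch below. The repo id, Context/Claim prompt layout, and label handling are assumptions inferred from this card's tags, not a documented API; see the demo and GitHub links above for canonical usage.

```python
# Hypothetical usage sketch for HallOumi-8B-classifier. The repo id,
# Context/Claim prompt layout, and label semantics are ASSUMPTIONS
# based on this model card, not a documented API.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "oumi-ai/HallOumi-8B-classifier"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# A known source of truth plus the claim to check against it.
context = "The Great Wall of China is over 13,000 miles long."
claim = "The Great Wall of China is about 500 miles long."

inputs = tokenizer(f"Context: {context}\nClaim: {claim}", return_tensors="pt")
inputs = inputs.to(model.device)

with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]

# config.id2label holds the class names/order actually shipped with the model.
for label_id, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[label_id]}: {p:.3f}")
```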