gcuomo committed
Commit 2abcc91 · verified · 1 Parent(s): 98b0d4d

Update README.md

Files changed (1):
  1. README.md +28 -25
README.md CHANGED
@@ -29,26 +29,28 @@ So while the original LIAR dataset supplies factual claims from political discou
 
 ### Task Format
 
-The model reframes classification as a text-to-text task, generating a numeric label (as a string) for each claim:
+This model treats classification as a **text-to-text generation task**. Each input is a short claim or quote, and the model responds with one of six factuality labels, generated directly as a lowercase string:
 
-- `0`: pants-fire
-- `1`: false
-- `2`: barely-true
-- `3`: half-true
-- `4`: mostly-true
-- `5`: true
+- `pants-fire`
+- `false`
+- `barely-true`
+- `half-true`
+- `mostly-true`
+- `true`
 
-**Example Input**:
+The input format uses a summarization-style prefix to frame the task:
+
+**Example Input**:
 ```
-veracity: Open-source AI systems cannot hallucinate because they're transparent.
+summarize: Python is the fastest programming language available.
 ```
 
 **Example Output**:
 ```
-0
+half-true
 ```
 
-You can map the numeric labels back to human-readable categories using a simple dictionary.
+This response reflects the model’s ability to evaluate short-form claims with nuance, producing a graded label based on its understanding of truthfulness.
 
 ### Training Details
 
@@ -75,8 +77,11 @@ It is **not** intended for production-grade fact-checking or regulatory enforcem
 ### Example Usage
 
 ```python
+### Example Usage
+
 from transformers import T5ForConditionalGeneration, T5Tokenizer
 
+# Load the fine-tuned model and tokenizer
 model = T5ForConditionalGeneration.from_pretrained(
     "gcuomo/open-source-ai-t5-liar-lens"
 )
@@ -84,20 +89,18 @@ tokenizer = T5Tokenizer.from_pretrained(
     "gcuomo/open-source-ai-t5-liar-lens"
 )
 
-label_map = {
-    "0": "pants-fire",
-    "1": "false",
-    "2": "barely-true",
-    "3": "half-true",
-    "4": "mostly-true",
-    "5": "true"
-}
-
-input_text = "veracity: Blockchain guarantees ethical outcomes in all AI systems."
-inputs = tokenizer(input_text, return_tensors="pt")
-output = model.generate(**inputs)
-prediction = tokenizer.decode(output[0], skip_special_tokens=True).strip()
-print("Predicted label:", label_map.get(prediction, prediction))
+# Prepare input
+statement = "Blockchain guarantees ethical outcomes in all AI systems."
+prompt = f"summarize: {statement}"
+inputs = tokenizer(prompt, return_tensors="pt", padding=True, truncation=True, max_length=128)
+
+# Generate prediction
+output = model.generate(**inputs, max_new_tokens=8)
+prediction = tokenizer.decode(output[0], skip_special_tokens=True).strip().lower()
+
+# Print result
+print("Predicted label:", prediction)
+
 ```
 
 ### Citation
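
This commit switches the model's output from numeric strings (`"0"`–`"5"`) to lowercase label strings, so a reader reconciling predictions made against both README versions needs a mapping between the two schemes. Below is a minimal sketch (not part of the commit; all names are hypothetical) that maps the old numeric labels onto the new strings and validates free-form generations, since a generative classifier can in principle emit text outside the label set:

```python
# Hypothetical migration helpers, assuming only the label sets shown in the
# before/after versions of the README in this diff.

# Old scheme (pre-commit): numeric strings "0".."5".
# New scheme (post-commit): lowercase label strings.
OLD_TO_NEW = {
    "0": "pants-fire",
    "1": "false",
    "2": "barely-true",
    "3": "half-true",
    "4": "mostly-true",
    "5": "true",
}
NEW_LABELS = set(OLD_TO_NEW.values())

def normalize_label(generated):
    """Return a canonical post-commit label, or None for off-label output.

    Accepts either scheme: numeric strings from the old prompt format,
    or lowercase label strings from the new one.
    """
    cleaned = generated.strip().lower()
    if cleaned in OLD_TO_NEW:
        return OLD_TO_NEW[cleaned]
    return cleaned if cleaned in NEW_LABELS else None

print(normalize_label("0"))            # -> pants-fire
print(normalize_label(" Half-True "))  # -> half-true
print(normalize_label("maybe"))        # -> None
```

Validating against a closed label set is a reasonable guard here because the updated snippet decodes with `max_new_tokens=8` and prints whatever string the model generates, with no guarantee it belongs to the six categories.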