codelion committed
Commit 4c1cc67 · verified · 1 Parent(s): ae97dda

Add comprehensive model card with usage instructions and evaluation results

Files changed (1)
README.md +4 -4
README.md CHANGED
@@ -39,8 +39,8 @@ This LoRA adapter enhances google/gemma-3-1b-it with structured reasoning capabi
  - **Training Method**: GRPO (Group Relative Policy Optimization)
  - **LoRA Rank**: 64
  - **LoRA Alpha**: 128
- - **Training Samples**: 614
- - **Thinking Tag Usage**: 0.0%
+ - **Training Samples**: 107
+ - **Thinking Tag Usage**: 40.0%
  - **Average Quality Score**: 0.00

  ## 🔧 Usage
@@ -68,7 +68,7 @@ Problem: If a train travels 120 miles in 2 hours, then increases its speed by 30
  Response:'''

  inputs = tokenizer(prompt, return_tensors="pt")
- outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.5)
+ outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2)
  response = tokenizer.decode(outputs[0], skip_special_tokens=True)
  print(response)
  ```
@@ -131,7 +131,7 @@ The model was trained on self-generated reasoning problems across multiple domai
  ## 🔬 Evaluation

  The adapter was evaluated on diverse reasoning tasks:
- - Thinking tag usage rate: 0.0%
+ - Thinking tag usage rate: 40.0%
  - Average reasoning quality score: 0.00
  - Response comprehensiveness: 0 words average
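
For context on the snippet touched by this commit, here is a minimal end-to-end sketch of how the updated usage section could be run. The adapter repo id, the peft loading step, the placeholder prompt, and the `do_sample=True` flag are assumptions (only the tokenizer and `generate` lines appear in the diff); `temperature=0.2` is the value introduced by this commit.

```python
# Hypothetical usage sketch; only the generate() settings come from this commit's diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-3-1b-it"
adapter_id = "codelion/<this-adapter-repo>"  # placeholder, not taken from the diff

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the LoRA adapter

prompt = """Problem: <insert a reasoning problem here>

Response:"""

inputs = tokenizer(prompt, return_tensors="pt")
# temperature lowered from 0.5 to 0.2 in this commit; do_sample=True is added here
# so the temperature setting actually takes effect (it is not in the README snippet).
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```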