Update README.md
README.md
CHANGED
@@ -22,12 +22,13 @@ model-index:
 metrics:
 - name: Loss
 type: loss
-value: 4.
+value: 4.69
 ---
 
 # T5-Small with LoRA on OpenCodeReasoning
 
 This is a LoRA fine-tuned version of T5-small on a subset of NVIDIA's OpenCodeReasoning dataset using [PEFT](https://github.com/huggingface/peft).
+Improved version to be uploaded soon.
 
 ## Loss Curve
 
@@ -43,7 +44,8 @@ This is a LoRA fine-tuned version of T5-small on a subset of NVIDIA's OpenCodeRe
 | 400 | 4.89 | 4.42 |
 | 450 | 4.69 | 4.40 |
 
-Final Train Loss: **
+Final Train Loss: **4.69**
+Final Eval Loss: **4.40**
 
 ## Example Usage
 
@@ -59,8 +61,9 @@ tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/t5-small-opencode-lora")
 inputs = tokenizer("generate code: write a function to reverse a string", return_tensors="pt")
 outputs = model.generate(**inputs)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+'''
 
-Notes
+## Notes
 
 Trained on subset of OpenCodeReasoning due to Colab memory limits
 
@@ -69,6 +72,6 @@ Use PeftModel with t5-small base
 Metrics used: Loss (BLEU skipped due to output structure)
 
 
-License
+## License
 
 Apache 2.0
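
The final losses added in this change (train 4.69, eval 4.40) can be easier to read as perplexities. The sketch below is not part of the commit, and the `perplexity` helper is a hypothetical name introduced here. It assumes the reported loss is a mean per-token cross-entropy in natural-log units, which the README does not state.

```python
import math

# Final losses as reported in the README change.
final_train_loss = 4.69
final_eval_loss = 4.40


def perplexity(mean_token_loss: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_token_loss)


train_ppl = perplexity(final_train_loss)
eval_ppl = perplexity(final_eval_loss)

print(f"train perplexity: {train_ppl:.1f}")
print(f"eval perplexity:  {eval_ppl:.1f}")
```

If the loss was instead computed in a different log base or summed per sequence, these figures would not apply.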