Update README.md
README.md
CHANGED
@@ -14,7 +14,8 @@ datasets:
 ---
 
 ## Inroduction
-SA stands for Safety and alignment. We fine tuned DeepCoder-1.5B-Preview with STAR-1 for 250 steps to enhance safety alignment using unsloth SFT cookbook
+SA stands for Safety and alignment. We fine tuned DeepCoder-1.5B-Preview with STAR-1 for 250 steps to enhance safety alignment using unsloth SFT cookbook.
+
 This model is fine-tuned with policy-grounded data to be safe and aligned with human values while coding. Specifically, it utilizes the STAR-1 dataset, which integrates diverse, deliberative reasoning examples evaluated rigorously by GPT-4o. This ensures the model maintains robust safety standards and minimizes biases, promoting responsible, secure, and effective coding practices without compromising its core reasoning capabilities.
 
 # Uploaded model