Update README.md
## Model Introduction

***Skywork-SWE-32B*** is a code agent model developed by [Skywork AI](https://skywork.ai/home), specifically designed for software engineering (SWE) tasks. It demonstrates strong performance across several key metrics:

- Skywork-SWE-32B attains 38.0% pass@1 accuracy on the [SWE-bench Verified](https://www.swebench.com) benchmark, outperforming previous open-source SoTA [Qwen2.5-Coder-32B-based](https://huggingface.co/Qwen/Qwen2.5-Coder-32B) LLMs built on the [OpenHands](https://github.com/All-Hands-AI/OpenHands) agent framework.
- When incorporated with test-time scaling techniques, its performance further improves to 47.0% accuracy, surpassing the previous SoTA results for sub-32B-parameter models (see the sketch after this list).
- We clearly demonstrate the data scaling law phenomenon for software engineering capabilities in LLMs, with no signs of saturation at 8,209 collected training trajectories.
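This card does not spell out the test-time scaling recipe behind the 47.0% figure; the details are in the technical report linked below. Purely as an illustration of the general pattern such techniques follow, here is a minimal best-of-N sketch: sample several candidate solutions and keep the one a scoring function prefers. The `generate` and `score` stubs are hypothetical placeholders, not the actual rollout or verifier from the report.

```python
# Illustrative best-of-N test-time scaling: draw n candidates and keep the
# highest-scoring one. The generator and scorer here are toy stand-ins for
# real agent rollouts and a real verifier.
import random
from typing import Callable, List


def best_of_n(generate: Callable[[], str], score: Callable[[str], float], n: int = 8) -> str:
    """Sample n candidate solutions and return the one with the best score."""
    candidates: List[str] = [generate() for _ in range(n)]
    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy usage: "patches" are random strings, and the score is just the
    # embedded number. Replace both with model rollouts and a verifier.
    generate = lambda: f"patch-{random.randint(0, 99)}"
    score = lambda patch: float(patch.split("-")[1])
    print(best_of_n(generate, score, n=4))
```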
We also introduce an efficient and automated pipeline for SWE data collection, culminating in the creation of the Skywork-SWE dataset: a large-scale, high-quality dataset featuring comprehensive executable runtime environments. Detailed descriptions are available in our [technical report](https://huggingface.co/Skywork/Skywork-SWE-32B/resolve/main/assets/Report.pdf).
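For readers who want to try the checkpoint directly, the following quick-start sketch loads the model with the Hugging Face `transformers` library. It is our addition, not part of the original card: the model ID matches this repository, the chat template is assumed to be inherited from the Qwen2.5-Coder base tokenizer, and the generation settings are illustrative defaults. For agentic SWE tasks, the model is meant to be driven through the OpenHands framework linked above.

```python
# Minimal loading sketch for Skywork-SWE-32B with Hugging Face transformers.
# Sampling parameters are illustrative defaults, not recommendations from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Skywork/Skywork-SWE-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a single-turn prompt via the tokenizer's chat template (assumed to be
# present, as it is for the Qwen2.5-Coder base model).
messages = [{"role": "user", "content": "Write a Python function that reverses a linked list."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```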