zengliangcs committed (verified) · Commit 6842aa2 · Parent(s): ac05540

Update README.md

Files changed (1): README.md (+1, -1)
README.md CHANGED
@@ -21,7 +21,7 @@ base_model:
 ## Model Introduction
 ***Skywork-SWE-32B*** is a code agent model developed by [Skywork AI](https://skywork.ai/home), specifically designed for software engineering (SWE) tasks. It demonstrates strong performance across several key metrics:
 - Skywork-SWE-32B attains 38.0% pass@1 accuracy on the [SWE-bench Verified](https://www.swebench.com) benchmark, outperforming previous open-source SoTA [Qwen2.5-Coder-32B-based](https://huggingface.co/Qwen/Qwen2.5-Coder-32B) LLMs built on the [OpenHands](https://github.com/All-Hands-AI/OpenHands) agent framework.
-- When incorporated with test-time scaling techniques, the performance further improves to 47.0% pass@1 accuracy, surpassing the previous SoTA results for sub-32B parameter models.
+- When combined with test-time scaling techniques, performance further improves to 47.0% accuracy, surpassing the previous SoTA results for sub-32B parameter models.
 - We clearly demonstrate the data scaling law phenomenon for software engineering capabilities in LLMs, with no signs of saturation at 8209 collected training trajectories.

 We also introduce an efficient and automated pipeline for SWE data collection, culminating in the creation of the Skywork-SWE dataset: a large-scale, high-quality dataset featuring comprehensive executable runtime environments. Detailed descriptions are available in our [technical report](https://huggingface.co/Skywork/Skywork-SWE-32B/resolve/main/assets/Report.pdf).
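
For context on the headline metric: pass@1 is conventionally computed with the unbiased pass@k estimator of Chen et al. (2021), which at k = 1 reduces to the fraction of benchmark instances resolved on a single attempt. The sketch below illustrates that standard definition; it is an illustrative assumption, not the evaluation harness actually used for the numbers above.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples, drawn from n attempts of which c succeeded, solves the
    task. At k = 1 this reduces to the plain success rate c / n."""
    if n - c < k:
        return 1.0  # any k-subset necessarily contains a success
    return 1.0 - comb(n - c, k) / comb(n, k)

# Aggregate pass@1 over a benchmark is the mean per-instance value.
# Illustrative only: resolving 190 of SWE-bench Verified's 500
# instances on one attempt each would yield the quoted 38.0%.
score = sum(pass_at_k(n=1, c=int(resolved), k=1)
            for resolved in [True] * 190 + [False] * 310) / 500
print(f"pass@1 = {score:.3f}")  # -> pass@1 = 0.380
```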