Update README.md
README.md CHANGED
@@ -81,6 +81,13 @@ It demonstrates strong capabilities in:
 - Reasoning about code structure and inferring missing logic.
 - Generalizing across different programming languages, coding styles, and codebases.
 
+| | DeepSeek-Coder-6.7B-Base | OpenCoder-8B-Base | Qwen2.5-Coder-7B | Seed-Coder-8B-Base |
+|------------|--------------------------|-------------------|:----------------:|--------------------|
+| HumanEval | 47.6 | 66.5 | 72.0 | 77.4 |
+| MBPP | 70.2 | 79.9 | 79.4 | 82.0 |
+| MultiPL-E | 44.7 | 61.0 | 58.8 | 67.6 |
+| CruxEval-O | 41.0 | 43.9 | 56.0 | 48.4 |
+
 For detailed benchmark results, please refer to our [📑 paper](https://arxiv.org/pdf/xxx.xxxxx).
 
 ## Citation