Update README.md
README.md
CHANGED
@@ -11,21 +11,22 @@ We are thrilled to introduce Seed-Coder, a powerful, transparent, and parameter-
 - Transparent: We openly share detailed insights into our model-centric data pipeline, including methods for curating GitHub data, commits data, and code-related web data.
 - Powerful: Seed-Coder achieves state-of-the-art performance among open-source models of comparable size across a diverse range of coding tasks.
 
+<p align="center">
+<img width="100%" src="imgs/seed-coder_intro_performance.jpg">
+</p>
 
 ## Highlight
 
 
-
-- Pretrained on a
-- Excels at
-- Robust performance across
--
+Seed-Coder-8B-Base is an 8-billion-parameter foundation model tailored for code understanding and generation. It is designed to provide developers with a powerful, general-purpose code model capable of handling a wide range of coding tasks. It features:
+- Pretrained on a massively curated corpus, filtered using **LLM-based techniques** to ensure high-quality real-world code, resulting in cleaner and more effective learning signals.
+- Excels at code completion and supports Fill-in-the-Middle (FIM) tasks, enabling it to predict missing code spans given partial contexts.
+- Robust performance across various programming languages, making it ideal for downstream finetuning or direct use in code generation systems.
+- Long-context support up to 32K tokens, enabling it to handle large codebases, multi-file projects, and extended editing tasks.
 
 Seed-Coder-8B-Base serves as the foundation for Seed-Coder-8B-Instruct and Seed-Coder-8B-reasoning.
 
-
-<img width="100%" src="imgs/seed-coder_intro_performance.jpg">
-</p>
+
 
 ## Model Downloads
 | Model Name | Length | Download | Notes |