Update README.md
README.md
CHANGED
@@ -7,7 +7,7 @@ license: apache-2.0
 ## Introduction
 **Seed-Coder-8B-Base** is an 8-billion-parameter foundation model tailored for code understanding and generation. It is designed to provide developers with a powerful, general-purpose code model capable of handling a wide range of coding tasks.
 It features:
--
+- Pretrained on a **massively curated corpus**, filtered using **LLM-based techniques** to ensure **high-quality real-world code**, **text-code alignment data**, and **synthetic datasets**, resulting in cleaner and more effective learning signals.
 - Excels at **code completion** and supports **Fill-in-the-Middle (FIM)** tasks, enabling it to predict missing code spans given partial contexts.
 - Robust performance across **various programming languages** and **code reasoning scenarios**, making it ideal for downstream finetuning or direct use in code generation systems.
 - **Long-context support** up to 32K tokens, enabling it to handle large codebases, multi-file projects, and extended editing tasks.
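
The README mentions Fill-in-the-Middle (FIM) support, where the model predicts a missing span from the code before and after it. The following is a minimal sketch of how such a prompt is commonly assembled. The sentinel strings `<FIM_PREFIX>`, `<FIM_SUFFIX>`, and `<FIM_MIDDLE>` and the helper names are hypothetical placeholders, not confirmed Seed-Coder tokens; the real special tokens and their expected order should be taken from the model's tokenizer configuration.

```python
# Hypothetical sentinel strings (assumption): the model's actual FIM tokens may differ.
FIM_PREFIX = "<FIM_PREFIX>"
FIM_SUFFIX = "<FIM_SUFFIX>"
FIM_MIDDLE = "<FIM_MIDDLE>"


def build_fim_prompt(prefix: str, suffix: str, suffix_first: bool = False) -> str:
    """Assemble a FIM prompt from the code before (prefix) and after (suffix) a gap.

    With suffix_first=False the order is prefix, suffix, middle (often called PSM);
    with suffix_first=True the suffix is placed first (often called SPM).
    """
    if suffix_first:
        return f"{FIM_SUFFIX}{suffix}{FIM_PREFIX}{prefix}{FIM_MIDDLE}"
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def splice_completion(prefix: str, completion: str, suffix: str) -> str:
    """Place the model's generated middle span back between prefix and suffix."""
    return prefix + completion + suffix


if __name__ == "__main__":
    before = "def add(a, b):\n    return "
    after = "\n\nprint(add(1, 2))\n"
    print(build_fim_prompt(before, after))
    # A model would generate the middle span; "a + b" stands in for that output here.
    print(splice_completion(before, "a + b", after))
```

The prompt ends with the middle sentinel so that whatever the model generates next is the missing span, which is then spliced back between the two known parts.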