---
license: other
license_name: deepseek-license
license_link: LICENSE
base_model: deepseek-ai/DeepSeek-Coder-V2-Base
---
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

<div align="center">
<img src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/logo.svg?raw=true" width="60%" alt="DeepSeek-V2" />
</div>
<hr>
<div align="center" style="line-height: 1;">
<a href="https://www.deepseek.com/" target="_blank" style="margin: 2px;">
<img alt="Homepage" src="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/badge.svg?raw=true" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://chat.deepseek.com/" target="_blank" style="margin: 2px;">
<img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-DeepSeek%20V2-536af5?color=536af5&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://huggingface.co/deepseek-ai" target="_blank" style="margin: 2px;">
<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-DeepSeek%20AI-ffc107?color=ffc107&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>

<div align="center" style="line-height: 1;">
<a href="https://discord.gg/Tc7c45Zzu5" target="_blank" style="margin: 2px;">
<img alt="Discord" src="https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&logoColor=white&color=7289da" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true" target="_blank" style="margin: 2px;">
<img alt="Wechat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://twitter.com/deepseek_ai" target="_blank" style="margin: 2px;">
<img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>

<div align="center" style="line-height: 1;">
<a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-CODE" style="margin: 2px;">
<img alt="Code License" src="https://img.shields.io/badge/Code_License-MIT-f5de53?&color=f5de53" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/LICENSE-MODEL" style="margin: 2px;">
<img alt="Model License" src="https://img.shields.io/badge/Model_License-Model_Agreement-f5de53?&color=f5de53" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
<p align="center">
<a href="#4-api-platform">API Platform</a> |
<a href="#5-how-to-run-locally">How to Use</a> |
<a href="#7-license">License</a>
</p>

<p align="center">
<a href="https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/paper.pdf"><b>Paper Link</b>👁️</a>
</p>

# DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

## 1. Introduction
We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as in reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K.

<p align="center">
<img width="100%" src="https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/figures/performance.png?raw=true">
</p>

In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks. The list of supported programming languages can be found [here](https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/supported_langs.txt).

## 2. Model Downloads

We release DeepSeek-Coder-V2 to the public in 16B and 236B parameter sizes, based on the [DeepSeekMoE](https://arxiv.org/pdf/2401.06066) framework, with only 2.4B and 21B active parameters respectively, in both base and instruct variants.

<div align="center">

| **Model** | **#Total Params** | **#Active Params** | **Context Length** | **Download** |
| :-----------------------------: | :---------------: | :----------------: | :----------------: | :----------------------------------------------------------: |
| DeepSeek-Coder-V2-Lite-Base | 16B | 2.4B | 128k | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Base) |
| DeepSeek-Coder-V2-Lite-Instruct | 16B | 2.4B | 128k | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) |
| DeepSeek-Coder-V2-Base | 236B | 21B | 128k | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Base) |
| DeepSeek-Coder-V2-Instruct | 236B | 21B | 128k | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct) |
| DeepSeek-Coder-V2-Instruct-0724 | 236B | 21B | 128k | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct-0724) |

</div>

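If you prefer to fetch the weights programmatically rather than through the links above, the sketch below uses the `huggingface_hub` client; the local directory name is just an illustrative choice.

```python
# Minimal sketch: download a checkpoint with huggingface_hub.
# The target directory is an arbitrary example path.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct",
    local_dir="./DeepSeek-Coder-V2-Lite-Instruct",
)
```
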
## 3. Chat Website

You can chat with DeepSeek-Coder-V2 on DeepSeek's official website: [coder.deepseek.com](https://coder.deepseek.com/sign_in)

## 4. API Platform

We also provide an OpenAI-compatible API on the DeepSeek Platform: [platform.deepseek.com](https://platform.deepseek.com/), where you can pay as you go at an unbeatable price.

<p align="center">
<img width="40%" src="https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/figures/model_price.jpg?raw=true">
</p>

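Because the API is OpenAI-compatible, any OpenAI-style client can talk to it. The snippet below is a minimal sketch using the `openai` Python SDK; the base URL and the `deepseek-coder` model identifier are illustrative, so check the platform documentation for the current values before use.

```python
# Minimal sketch of calling the OpenAI-compatible endpoint with the `openai` SDK.
# The base URL and model identifier are illustrative; check the platform docs.
from openai import OpenAI

client = OpenAI(api_key="<your DeepSeek API key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-coder",
    messages=[{"role": "user", "content": "write a quick sort algorithm in python."}],
    temperature=0.3,
)
print(response.choices[0].message.content)
```
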
## 5. How to run locally

**Here, we provide some examples of how to use the DeepSeek-Coder-V2-Lite model. If you want to run the full DeepSeek-Coder-V2 in BF16 format for inference, 8 GPUs with 80GB memory each are required.**

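If you do want to load the full 236B model with Hugging Face Transformers, the sketch below shows one way to shard it across the GPUs of a single node; it assumes an 8x80GB machine with `accelerate` installed and is only a rough starting point.

```python
# Rough sketch: shard the 236B Instruct model across all visible GPUs.
# Assumes an 8x80GB node and that `accelerate` is installed; adjust to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-Coder-V2-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # let Accelerate place the layers across GPUs
)
```
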
### Inference with Hugging Face's Transformers
You can directly employ [Hugging Face's Transformers](https://github.com/huggingface/transformers) for model inference.

#### Code Completion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the 16B base model in bfloat16 on GPU
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

input_text = "#write a quick sort algorithm"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

#### Code Insertion
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Base", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

# Fill-in-the-middle prompt: the model completes the code at <|fim▁hole|>
input_text = """<|fim▁begin|>def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
<|fim▁hole|>
        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)<|fim▁end|>"""
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True)[len(input_text):])
```

#### Chat Completion

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()

messages = [
    {'role': 'user', 'content': "write a quick sort algorithm in python."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of the <|end▁of▁sentence|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
```

The complete chat template can be found in `tokenizer_config.json` in the Hugging Face model repository.

An example of the chat template is shown below:

```bash
<|begin▁of▁sentence|>User: {user_message_1}

Assistant: {assistant_message_1}<|end▁of▁sentence|>User: {user_message_2}

Assistant:
```

You can also add an optional system message:

```bash
<|begin▁of▁sentence|>{system_message}

User: {user_message_1}

Assistant: {assistant_message_1}<|end▁of▁sentence|>User: {user_message_2}

Assistant:
```

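To check what the tokenizer actually produces for a given conversation, you can render the template as plain text instead of token ids. This is a small sketch reusing the `tokenizer` loaded in the Chat Completion example above.

```python
# Render the chat template as a string instead of token ids,
# reusing the tokenizer from the Chat Completion example above.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "write a quick sort algorithm in python."},
]
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
```
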
### Inference with vLLM (recommended)
To utilize [vLLM](https://github.com/vllm-project/vllm) for model inference, please merge this Pull Request into your vLLM codebase: https://github.com/vllm-project/vllm/pull/4650.

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

max_model_len, tp_size = 8192, 1
model_name = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, tensor_parallel_size=tp_size, max_model_len=max_model_len, trust_remote_code=True, enforce_eager=True)
sampling_params = SamplingParams(temperature=0.3, max_tokens=256, stop_token_ids=[tokenizer.eos_token_id])

messages_list = [
    [{"role": "user", "content": "Who are you?"}],
    [{"role": "user", "content": "write a quick sort algorithm in python."}],
    [{"role": "user", "content": "Write a piece of quicksort code in C++."}],
]

prompt_token_ids = [tokenizer.apply_chat_template(messages, add_generation_prompt=True) for messages in messages_list]

outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)

generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
```

## 6. New Features 🎉🎉🎉

### Function calling

Function calling allows the model to call external tools to enhance its capabilities.

Here is an example:

```python
import torch
from transformers import GenerationConfig

# Assume that `model` and `tokenizer` are loaded
model.generation_config = GenerationConfig(do_sample=False, max_new_tokens=128, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.eos_token_id)

tool_system_prompt = """You are a helpful Assistant.

## Tools

### Function

You have the following functions available:

- `get_current_weather`:
```json
{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
                "type": "string",
                "enum": [
                    "celsius",
                    "fahrenheit"
                ]
            }
        },
        "required": [
            "location"
        ]
    }
}
```"""

tool_call_messages = [{"role": "system", "content": tool_system_prompt}, {"role": "user", "content": "What's the weather like in Tokyo and Paris?"}]
tool_call_inputs = tokenizer.apply_chat_template(tool_call_messages, add_generation_prompt=True, return_tensors="pt")
tool_call_outputs = model.generate(tool_call_inputs.to(model.device))
# Generated text: '<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>get_current_weather\n```json\n{"location": "Tokyo"}\n```<|tool▁call▁end|>\n<|tool▁call▁begin|>function<|tool▁sep|>get_current_weather\n```json\n{"location": "Paris"}\n```<|tool▁call▁end|><|tool▁calls▁end|><|end▁of▁sentence|>'

# Mock response of calling `get_current_weather`
tool_messages = [{"role": "tool", "content": '{"location": "Tokyo", "temperature": "10", "unit": null}'}, {"role": "tool", "content": '{"location": "Paris", "temperature": "22", "unit": null}'}]
tool_inputs = tokenizer.apply_chat_template(tool_messages, add_generation_prompt=False, return_tensors="pt")[:, 1:]
tool_inputs = torch.cat([tool_call_outputs, tool_inputs.to(model.device)], dim=1)
tool_outputs = model.generate(tool_inputs)
# Generated text: The current weather in Tokyo is 10 degrees, and in Paris, it is 22 degrees.<|end▁of▁sentence|>
```

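The tool-call arguments come back as fenced JSON wrapped in special tokens, as in the generated text above. A small illustrative parser (a sketch, not an official utility) might look like this:

```python
import json
import re

def parse_tool_calls(generated_text: str):
    """Illustrative sketch: extract (function_name, arguments) pairs from the
    tool-call markup shown in the generated text above."""
    pattern = re.compile(
        r"<\|tool▁call▁begin\|>function<\|tool▁sep\|>(\w+)\n```json\n(.*?)\n```<\|tool▁call▁end\|>",
        re.DOTALL,
    )
    return [(name, json.loads(args)) for name, args in pattern.findall(generated_text)]

# e.g. [('get_current_weather', {'location': 'Tokyo'}), ('get_current_weather', {'location': 'Paris'})]
```
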
### JSON output

You can use JSON Output Mode to ensure that the model generates a valid JSON object. To activate this mode, a special instruction should be appended to your system prompt.

```python
from transformers import GenerationConfig

# Assume that `model` and `tokenizer` are loaded
model.generation_config = GenerationConfig(do_sample=False, max_new_tokens=128, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.eos_token_id)

user_system_prompt = 'The user will provide some exam text. Please parse the "question" and "answer" and output them in JSON format.'
json_system_prompt = f"""{user_system_prompt}

## Response Format

Reply with JSON object ONLY."""

json_messages = [{"role": "system", "content": json_system_prompt}, {"role": "user", "content": "Which is the highest mountain in the world? Mount Everest."}]
json_inputs = tokenizer.apply_chat_template(json_messages, add_generation_prompt=True, return_tensors="pt")
json_outputs = model.generate(json_inputs.to(model.device))
# Generated text: '```json\n{\n "question": "Which is the highest mountain in the world?",\n "answer": "Mount Everest."\n}\n```<|end▁of▁sentence|>'
```

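As the comment above shows, the object may still be wrapped in a Markdown fence and followed by the end-of-sentence token, so strip those before parsing. A minimal, illustrative helper:

```python
import json
import re

def extract_json(generated_text: str) -> dict:
    """Illustrative sketch: pull the JSON object out of the fenced,
    special-token-wrapped text produced above."""
    text = generated_text.replace("<|end▁of▁sentence|>", "")
    match = re.search(r"\{.*\}", text, re.DOTALL)
    return json.loads(match.group(0)) if match else {}
```
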
### FIM completion

In FIM (Fill In the Middle) completion, you can provide a prefix and an optional suffix, and the model will complete the content in between.

```python
from transformers import GenerationConfig

# Assume that `model` and `tokenizer` are loaded
model.generation_config = GenerationConfig(do_sample=False, max_new_tokens=128, eos_token_id=tokenizer.eos_token_id, pad_token_id=tokenizer.eos_token_id)

prefix = """def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    left = []
    right = []
"""

suffix = """
        if arr[i] < pivot:
            left.append(arr[i])
        else:
            right.append(arr[i])
    return quick_sort(left) + [pivot] + quick_sort(right)"""

fim_prompt = f"<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>"
fim_inputs = tokenizer(fim_prompt, add_special_tokens=True, return_tensors="pt").input_ids
fim_outputs = model.generate(fim_inputs.to(model.device))
# Generated text: "    for i in range(1, len(arr)):<|end▁of▁sentence|>"
```

## 7. License

This code repository is licensed under [the MIT License](https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/LICENSE-CODE). The use of the DeepSeek-Coder-V2 Base/Instruct models is subject to [the Model License](https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/LICENSE-MODEL). The DeepSeek-Coder-V2 series (including Base and Instruct) supports commercial use.

## 8. Contact

If you have any questions, please raise an issue or contact us at [[email protected]]([email protected]).