bigmoyan committed on
Commit d56abb2 · verified · 1 Parent(s): 13adc1d

add FAQ for tool calls.

  1. docs/tool_call_guidance.md +258 -241
docs/tool_call_guidance.md CHANGED
@@ -1,241 +1,258 @@
+ ## Tool Calling
+ To enable tool calling, you may need to set tool-call parser options when starting the service. See [deploy_guidance](./deploy_guidance.md) for details.
+ In Kimi-K2, a tool-calling round trip consists of the following steps:
+ - The user passes function descriptions to Kimi-K2.
+ - Kimi-K2 decides to make a function call and returns the information needed for that call to the user.
+ - The user performs the function call, collects the result, and passes it back to Kimi-K2.
+ - Kimi-K2 continues generating based on the function call results until it has enough information to respond to the user.
+
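As a rough illustration of the steps above, the chat history after one complete round trip might look like the following. This is a hand-written sketch, not captured output: the argument values, the tool result, and the final answer are made up, and the ID follows the `functions.{func_name}:{idx}` format described later in this document.

```python
import json

# Hand-written example history for one tool round trip (illustrative values only).
messages = [
    # Step 1: the user's question; the tool descriptions are sent alongside via `tools`
    {"role": "user", "content": "What's the weather like in Beijing today?"},
    # Step 2: the model requests a tool call instead of answering directly
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [{
            "id": "functions.get_weather:0",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": json.dumps({"city": "Beijing"}),
            },
        }],
    },
    # Step 3: the caller runs the tool and reports the result with role="tool"
    {
        "role": "tool",
        "tool_call_id": "functions.get_weather:0",
        "name": "get_weather",
        "content": json.dumps({"weather": "Sunny"}),
    },
    # Step 4: the model answers using the tool result
    {"role": "assistant", "content": "It is sunny in Beijing today."},
]

print([m["role"] for m in messages])
```

Note that the `tool_call_id` of the tool message matches the `id` of the call it answers.
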
+ ### Preparing Tools
+ Suppose we have a function `get_weather` that queries the weather in real time.
+ It accepts a city name as a parameter and returns the weather conditions. We need to prepare a structured description of it so that Kimi-K2 understands what it does.
+
+ ```python
+ def get_weather(city):
+     # Stub implementation; a real tool would look up the weather for `city`
+     return {"weather": "Sunny"}
+
+ # Collect the tool descriptions in tools
+ tools = [{
+     "type": "function",
+     "function": {
+         "name": "get_weather",
+         "description": "Get weather information. Call this tool when the user needs to get weather information",
+         "parameters": {
+             "type": "object",
+             "required": ["city"],
+             "properties": {
+                 "city": {
+                     "type": "string",
+                     "description": "City name",
+                 }
+             }
+         }
+     }
+ }]
+
+ # Tool name->object mapping for easy calling later
+ tool_map = {
+     "get_weather": get_weather
+ }
+ ```
+ ### Chat with tools
+ We use `openai.OpenAI` to send messages to Kimi-K2 along with the tool descriptions. Kimi-K2 decides on its own whether and how to use the provided tools.
+ If Kimi-K2 decides a tool call is needed, it returns a result with `finish_reason='tool_calls'`, and the returned message includes the tool call information.
+ After calling the tools with that information, append the tool call results to the chat history and call Kimi-K2 again.
+ Kimi-K2 may call tools several times before it considers the results sufficient to answer the user's question, so keep looping until `finish_reason` is no longer `tool_calls`.
+
+ Add the result of each tool call to `messages` with `role='tool'`.
+
+ ```python
+ import json
+ from openai import OpenAI
+
+ model_name = 'moonshotai/Kimi-K2-Instruct'
+ # `endpoint` is the base URL of your OpenAI-compatible service; define it before running
+ client = OpenAI(base_url=endpoint, api_key='xxx')
+
+ messages = [
+     {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
+ ]
+ finish_reason = None
+ while finish_reason is None or finish_reason == "tool_calls":
+     completion = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         temperature=0.3,
+         tools=tools,
+         tool_choice="auto",
+     )
+     choice = completion.choices[0]
+     finish_reason = choice.finish_reason
+     # Note: The finish_reason when tool calls end may vary across different engines, so this condition check needs to be adjusted accordingly
+     if finish_reason == "tool_calls":
+         messages.append(choice.message)
+         for tool_call in choice.message.tool_calls:
+             tool_call_name = tool_call.function.name
+             tool_call_arguments = json.loads(tool_call.function.arguments)
+             tool_function = tool_map[tool_call_name]
+             # Unpack the parsed JSON object into keyword arguments, e.g. get_weather(city="Beijing")
+             tool_result = tool_function(**tool_call_arguments)
+             print("tool_result", tool_result)
+
+             messages.append({
+                 "role": "tool",
+                 "tool_call_id": tool_call.id,
+                 "name": tool_call_name,
+                 "content": json.dumps(tool_result),
+             })
+ print('-' * 100)
+ print(choice.message.content)
+ ```
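
The loop above assumes that every requested tool name is known and that every argument string is valid JSON. A minimal sketch of a more defensive dispatcher follows. `run_tool_call` is a hypothetical helper (it is not part of the OpenAI SDK or of Kimi-K2), and returning the error as the tool result so that the model can react to it is a design assumption, not a documented requirement.

```python
import json

def get_weather(city):
    # Stub tool, same shape as the example in "Preparing Tools"
    return {"weather": "Sunny"}

tool_map = {"get_weather": get_weather}

def run_tool_call(name, arguments_json, tool_map):
    """Hypothetical helper: run one tool call and always return a JSON string."""
    tool_function = tool_map.get(name)
    if tool_function is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid JSON arguments: {exc}"})
    if not isinstance(arguments, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    try:
        return json.dumps(tool_function(**arguments))
    except Exception as exc:
        # Report tool failures to the model instead of crashing the chat loop
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})

print(run_tool_call("get_weather", '{"city": "Beijing"}', tool_map))
print(run_tool_call("get_time", "{}", tool_map))
```

The returned string can be used directly as the `content` of the `role='tool'` message.
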
+ ### Tool Calling in Streaming Mode
+ Tool calling also works in streaming mode. In this case, collect the tool call fragments returned in the stream until each tool call is complete, as in the code below:
+
+ ```python
+ messages = [
+     {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
+ ]
+ finish_reason = None
+ msg = ''
+ while finish_reason is None or finish_reason == "tool_calls":
+     completion = client.chat.completions.create(
+         model=model_name,
+         messages=messages,
+         temperature=0.3,
+         tools=tools,
+         tool_choice="auto",
+         stream=True
+     )
+     tool_calls = []
+     for chunk in completion:
+         delta = chunk.choices[0].delta
+         if delta.content:
+             msg += delta.content
+         if delta.tool_calls:
+             for tool_call_chunk in delta.tool_calls:
+                 if tool_call_chunk.index is not None:
+                     # Extend the tool_calls list
+                     while len(tool_calls) <= tool_call_chunk.index:
+                         tool_calls.append({
+                             "id": "",
+                             "type": "function",
+                             "function": {
+                                 "name": "",
+                                 "arguments": ""
+                             }
+                         })
+
+                     tc = tool_calls[tool_call_chunk.index]
+
+                     # Each field may arrive in pieces, so concatenate the fragments
+                     if tool_call_chunk.id:
+                         tc["id"] += tool_call_chunk.id
+                     if tool_call_chunk.function.name:
+                         tc["function"]["name"] += tool_call_chunk.function.name
+                     if tool_call_chunk.function.arguments:
+                         tc["function"]["arguments"] += tool_call_chunk.function.arguments
+
+         finish_reason = chunk.choices[0].finish_reason
+     # Note: The finish_reason when tool calls end may vary across different engines, so this condition check needs to be adjusted accordingly
+     if finish_reason == "tool_calls":
+         # Record the assistant turn that requested the tools before adding the tool results
+         messages.append({"role": "assistant", "content": msg, "tool_calls": tool_calls})
+         for tool_call in tool_calls:
+             tool_call_name = tool_call['function']['name']
+             tool_call_arguments = json.loads(tool_call['function']['arguments'])
+             tool_function = tool_map[tool_call_name]
+             tool_result = tool_function(**tool_call_arguments)
+             messages.append({
+                 "role": "tool",
+                 "tool_call_id": tool_call['id'],
+                 "name": tool_call_name,
+                 "content": json.dumps(tool_result),
+             })
+         # The text generated by the tool call is not the final version, reset msg
+         msg = ''
+
+ print(msg)
+ ```
+ ### Manually Parsing Tool Calls
+ Tool call requests generated by Kimi-K2 can also be parsed manually, which is especially useful when the service you are using does not provide a tool-call parser.
+ The tool call requests are wrapped in `<|tool_calls_section_begin|>` and `<|tool_calls_section_end|>`,
+ with each tool call wrapped in `<|tool_call_begin|>` and `<|tool_call_end|>`. Within a call, the tool ID and the arguments are separated by `<|tool_call_argument_begin|>`.
+ The tool ID has the format `functions.{func_name}:{idx}`, from which the function name can be parsed.
+
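To make the format concrete, here is a hand-written sample of what a raw tool call section might look like, together with a hypothetical `split_tool_id` helper that separates a tool ID into function name and index. The sample string is constructed for illustration and is not captured model output.

```python
import re
from typing import Tuple

# Hand-written sample of the wire format (not captured model output).
sample_output = (
    "<|tool_calls_section_begin|>"
    "<|tool_call_begin|>functions.get_weather:0"
    '<|tool_call_argument_begin|>{"city": "Beijing"}'
    "<|tool_call_end|>"
    "<|tool_calls_section_end|>"
)

def split_tool_id(tool_id: str) -> Tuple[str, int]:
    """Hypothetical helper: split 'functions.{func_name}:{idx}' into (func_name, idx)."""
    match = re.fullmatch(r"functions\.(?P<name>.+):(?P<idx>\d+)", tool_id)
    if match is None:
        raise ValueError(f"unexpected tool ID format: {tool_id!r}")
    return match.group("name"), int(match.group("idx"))

# The tool ID sits between the call-begin token and the argument-begin token.
tool_id = sample_output.split("<|tool_call_begin|>")[1].split("<|tool_call_argument_begin|>")[0]
print(split_tool_id(tool_id))
```
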
+ Based on the above rules, we can post a request directly to the completions interface and parse the tool calls manually.
+
+ ```python
+ import requests
+ from transformers import AutoTokenizer
+
+ # json, model_name, endpoint, tools and tool_map come from the earlier examples
+ messages = [
+     {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
+ ]
+ msg = ''
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ while True:
+     text = tokenizer.apply_chat_template(
+         messages,
+         tokenize=False,
+         tools=tools,
+         add_generation_prompt=True,
+     )
+     payload = {
+         "model": model_name,
+         "prompt": text,
+         "max_tokens": 512
+     }
+     response = requests.post(
+         f"{endpoint}/completions",
+         headers={"Content-Type": "application/json"},
+         json=payload,
+         stream=False,
+     )
+     raw_out = response.json()
+
+     raw_output = raw_out["choices"][0]["text"]
+     tool_calls = extract_tool_call_info(raw_output)
+     if len(tool_calls) == 0:
+         # No tool calls
+         msg = raw_output
+         break
+     else:
+         # Keep the assistant turn that requested the tools in the history
+         messages.append({
+             "role": "assistant",
+             "content": raw_output.split('<|tool_calls_section_begin|>')[0],
+             "tool_calls": tool_calls,
+         })
+         for tool_call in tool_calls:
+             tool_call_name = tool_call['function']['name']
+             tool_call_arguments = json.loads(tool_call['function']['arguments'])
+             tool_function = tool_map[tool_call_name]
+             tool_result = tool_function(**tool_call_arguments)
+
+             messages.append({
+                 "role": "tool",
+                 "tool_call_id": tool_call['id'],
+                 "name": tool_call_name,
+                 "content": json.dumps(tool_result),
+             })
+ print('-' * 100)
+ print(msg)
+ ```
+ Here, `extract_tool_call_info` parses the model output and returns the tool call information. A simple implementation would be:
+ ```python
+ import re
+
+ def extract_tool_call_info(tool_call_rsp: str):
+     if '<|tool_calls_section_begin|>' not in tool_call_rsp:
+         # No tool calls
+         return []
+     pattern = r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>"
+
+     tool_calls_sections = re.findall(pattern, tool_call_rsp, re.DOTALL)
+     if not tool_calls_sections:
+         # The section was opened but never closed, e.g. the output was truncated
+         return []
+
+     # Extract multiple tool calls
+     func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
+     tool_calls = []
+     for match in re.findall(func_call_pattern, tool_calls_sections[0], re.DOTALL):
+         function_id, function_args = match
+         # function_id: functions.get_weather:0
+         # Strip the fixed "functions." prefix and the trailing ":idx"
+         function_name = function_id.split('.', 1)[1].rsplit(':', 1)[0]
+         tool_calls.append(
+             {
+                 "id": function_id,
+                 "type": "function",
+                 "function": {
+                     "name": function_name,
+                     "arguments": function_args
+                 }
+             }
+         )
+     return tool_calls
+ ```
+
+ ## FAQ
+
+ #### Q1: I received special tokens like '<|tool_call_begin|>' in the 'content' field instead of a normal tool_call.
+
+ This indicates a tool-call crash, which most often occurs in multi-turn tool-calling scenarios because of an incorrect tool-call ID. K2 expects the ID to follow the format `functions.func_name:idx`, where `functions` is a fixed string, `func_name` is the actual function name (for example `get_weather`), and `idx` is a global counter that starts at 0 and increments with each function invocation.
+ Please check all tool-call IDs in the message list.
+
+
+ #### Q2: My tool-call ID is incorrect. How can I fix it?
+
+ First, make sure your code and chat template are up to date with the latest version from the Hugging Face repo.
+ If you're using vLLM or SGLang and they are generating random tool-call IDs, upgrade them to the latest release. For other frameworks, you must either parse the tool-call ID from the model output and set it correctly in the server-side response, or rewrite every tool-call ID on the client side according to the rules above before sending the messages to Kimi K2.
+
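The client-side rewrite mentioned in Q2 could be sketched roughly as follows. `normalize_tool_call_ids` is a hypothetical helper, and it assumes the history is a list of plain dicts in the OpenAI chat format, where each assistant `tool_calls` entry and its matching `role='tool'` message share the same original ID. Adapt it if your client stores message objects instead.

```python
import copy

def normalize_tool_call_ids(messages):
    """Hypothetical helper: return a copy of `messages` in which every tool-call ID
    is rewritten to functions.{func_name}:{idx}, with idx a global counter from 0."""
    messages = copy.deepcopy(messages)
    id_map = {}   # original ID -> rewritten ID
    counter = 0
    for message in messages:
        if message.get("role") == "assistant":
            for tool_call in message.get("tool_calls") or []:
                new_id = f"functions.{tool_call['function']['name']}:{counter}"
                id_map[tool_call["id"]] = new_id
                tool_call["id"] = new_id
                counter += 1
        elif message.get("role") == "tool":
            # A tool result must point at the rewritten ID of the call it answers
            old_id = message.get("tool_call_id")
            message["tool_call_id"] = id_map.get(old_id, old_id)
    return messages

# Example history with random-looking IDs, as some serving frameworks generate them
history = [
    {"role": "user", "content": "Weather in Beijing and Shanghai?"},
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "call_abc", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
        {"id": "call_def", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Shanghai"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_abc", "name": "get_weather",
     "content": '{"weather": "Sunny"}'},
    {"role": "tool", "tool_call_id": "call_def", "name": "get_weather",
     "content": '{"weather": "Cloudy"}'},
]

fixed = normalize_tool_call_ids(history)
print([tc["id"] for tc in fixed[1]["tool_calls"]])
```

The helper works on a deep copy, so the original history stays untouched and can still be matched against the IDs the serving framework returned.
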
+ #### Q3: My tool-call ID is correct, but multi-turn tool calls still crash.
+
+ Please describe your situation in the [discussion](https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905/discussions).