## Tool Calling
To enable the tool calling feature, you may need to set certain tool calling parser options when starting the service. See [deploy_guidance](./deploy_guidance.md) for details.
In Kimi-K2, a tool calling process includes:
- Passing function descriptions to Kimi-K2
- Kimi-K2 decides to make a function call and returns the necessary information for the function call to the user
- The user performs the function call, collects the call results, and passes the function call results to Kimi-K2
- Kimi-K2 continues to generate content based on the function call results until the model believes it has obtained sufficient information to respond to the user
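
The steps above can be sketched as the sequence of messages exchanged in one round trip. This is an illustration only: the tool call ID and arguments are hand-written placeholders rather than real model output, and `get_weather` refers to the example tool introduced in the next section.

```python
import json

# Step 1: the user's question is sent together with the tool descriptions.
messages = [{"role": "user", "content": "What's the weather like in Beijing today?"}]

# Step 2: the model answers with a tool call instead of a final reply
# (hand-written placeholder values, not real model output).
assistant_turn = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{
        "id": "functions.get_weather:0",
        "type": "function",
        "function": {
            "name": "get_weather",
            "arguments": json.dumps({"city": "Beijing"}),
        },
    }],
}
messages.append(assistant_turn)

# Step 3: the caller runs the function and reports the result with role="tool".
call = assistant_turn["tool_calls"][0]
result = {"weather": "Sunny"}  # stand-in for the value get_weather would return
messages.append({
    "role": "tool",
    "tool_call_id": call["id"],
    "name": call["function"]["name"],
    "content": json.dumps(result),
})

# Step 4: the updated history is sent back so the model can continue.
roles = [m["role"] for m in messages]
print(roles)  # → ['user', 'assistant', 'tool']
```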
### Preparing Tools
Suppose we have a function `get_weather` that queries the current weather conditions in real time.
This function accepts a city name as a parameter and returns the weather conditions. We need to prepare a structured description for it so that Kimi-K2 can understand its functionality.
```python
def get_weather(city):
    return {"weather": "Sunny"}

# Collect the tool descriptions in tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information. Call this tool when the user needs to get weather information",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name",
                }
            }
        }
    }
}]

# Tool name->object mapping for easy calling later
tool_map = {
    "get_weather": get_weather
}
```
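
Because the description declares which parameters are required, it can also be used to sanity-check arguments before a function is called. The following is a minimal sketch, not part of Kimi-K2 or any library: `check_arguments` is a hypothetical helper that only looks at the `required` and `properties` fields and does not perform full JSON Schema validation.

```python
def check_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems found in `arguments` for one tool description."""
    schema = tool["function"]["parameters"]
    problems = []
    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied name should be declared under "properties".
    for name in arguments:
        if name not in schema.get("properties", {}):
            problems.append(f"unexpected argument: {name}")
    return problems


# Same shape as the get_weather description above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information.",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {"city": {"type": "string", "description": "City name"}},
        },
    },
}

print(check_arguments(weather_tool, {"city": "Beijing"}))  # → []
print(check_arguments(weather_tool, {"town": "Beijing"}))
# → ['missing required argument: city', 'unexpected argument: town']
```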
### Chat with Tools
We use `openai.OpenAI` to send messages to Kimi-K2 along with the tool descriptions. Kimi-K2 decides on its own whether and how to use the provided tools.
If Kimi-K2 believes a tool call is needed, it will return a result with `finish_reason='tool_calls'`. At this point, the returned result includes the tool call information.
After calling tools with the provided information, we then need to append the tool call results to the chat history and continue calling Kimi-K2.
Kimi-K2 may need to call tools multiple times until the model believes the current results can answer the user's question. We should check `finish_reason` until it is not `tool_calls`.
The results obtained by the user after calling the tools should be added to `messages` with `role='tool'`.
```python
import json
from openai import OpenAI

model_name = 'moonshotai/Kimi-K2-Instruct'
# `endpoint` must hold the base URL of your OpenAI-compatible service.
# The value below is only a placeholder assumption; replace it with your own.
endpoint = 'http://localhost:8000/v1'
client = OpenAI(base_url=endpoint,
                api_key='xxx')

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
    )
    choice = completion.choices[0]
    finish_reason = choice.finish_reason
    # Note: The finish_reason when tool calls end may vary across different engines, so this condition check needs to be adjusted accordingly
    if finish_reason == "tool_calls":
        # Keep the assistant turn that requested the tool calls in the history
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            tool_call_name = tool_call.function.name
            tool_call_arguments = json.loads(tool_call.function.arguments)
            tool_function = tool_map[tool_call_name]
            # Unpack the parsed arguments as keyword arguments, e.g. get_weather(city="Beijing")
            tool_result = tool_function(**tool_call_arguments)
            print("tool_result", tool_result)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })

print('-' * 100)
print(choice.message.content)
```
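
The loop above assumes that every requested tool exists and that its arguments are valid JSON. One possible way to make the dispatch step more forgiving is to report failures back to the model as the tool result instead of raising. The sketch below is an assumption about how this could be done, not part of the Kimi-K2 documentation; `run_tool_call` and the `{"error": ...}` payload shape are hypothetical.

```python
import json


def get_weather(city):
    return {"weather": "Sunny"}


def run_tool_call(tool_map, tool_call_id, name, arguments_json):
    """Execute one tool call and return the role='tool' message to append."""
    if name not in tool_map:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            # Parse the JSON string and pass its fields as keyword arguments.
            result = tool_map[name](**json.loads(arguments_json))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON, non-object JSON, or arguments the function rejects.
            result = {"error": f"invalid arguments: {exc}"}
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "name": name,
        "content": json.dumps(result),
    }


tool_map = {"get_weather": get_weather}
ok = run_tool_call(tool_map, "functions.get_weather:0", "get_weather", '{"city": "Beijing"}')
print(ok["content"])  # → {"weather": "Sunny"}
missing = run_tool_call(tool_map, "functions.get_time:1", "get_time", "{}")
print(missing["content"])  # → {"error": "unknown tool: get_time"}
```

Whether the model recovers sensibly from such an error message depends on the model and prompt, so treat this as a starting point rather than a guarantee.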
### Tool Calling in Streaming Mode
Tool calling can also be used in streaming mode. In this case, we need to collect the tool call information returned in the stream until we have a complete tool call. Please refer to the code below:
```python
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
msg = ''
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
        stream=True
    )
    tool_calls = []
    for chunk in completion:
        delta = chunk.choices[0].delta
        if delta.content:
            msg += delta.content
        if delta.tool_calls:
            for tool_call_chunk in delta.tool_calls:
                if tool_call_chunk.index is not None:
                    # Extend the tool_calls list
                    while len(tool_calls) <= tool_call_chunk.index:
                        tool_calls.append({
                            "id": "",
                            "type": "function",
                            "function": {
                                "name": "",
                                "arguments": ""
                            }
                        })

                    # Append each fragment to the tool call it belongs to
                    tc = tool_calls[tool_call_chunk.index]
                    if tool_call_chunk.id:
                        tc["id"] += tool_call_chunk.id
                    if tool_call_chunk.function.name:
                        tc["function"]["name"] += tool_call_chunk.function.name
                    if tool_call_chunk.function.arguments:
                        tc["function"]["arguments"] += tool_call_chunk.function.arguments

        finish_reason = chunk.choices[0].finish_reason

    # Note: The finish_reason when tool calls end may vary across different engines, so this condition check needs to be adjusted accordingly
    if finish_reason == "tool_calls":
        # Record the assistant turn that requested the tool calls before adding the results
        messages.append({"role": "assistant", "content": msg, "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            # Unpack the parsed arguments as keyword arguments
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
        # The text generated by the tool call is not the final version, reset msg
        msg = ''

print(msg)
```
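
The accumulation step can be exercised without a running service. The sketch below pulls the merging logic into a standalone function and feeds it hand-built fragments; `merge_tool_call_deltas` and `fake_delta` are hypothetical names, and the `SimpleNamespace` objects only imitate the attributes (`index`, `id`, `function.name`, `function.arguments`) that the loop above reads from each streamed chunk.

```python
from types import SimpleNamespace


def merge_tool_call_deltas(deltas):
    """Concatenate streamed tool call fragments into complete tool calls."""
    tool_calls = []
    for d in deltas:
        # Grow the list until the slot for this index exists.
        while len(tool_calls) <= d.index:
            tool_calls.append(
                {"id": "", "type": "function", "function": {"name": "", "arguments": ""}}
            )
        tc = tool_calls[d.index]
        if d.id:
            tc["id"] += d.id
        if d.function and d.function.name:
            tc["function"]["name"] += d.function.name
        if d.function and d.function.arguments:
            tc["function"]["arguments"] += d.function.arguments
    return tool_calls


def fake_delta(index, id=None, name=None, arguments=None):
    """Build a stand-in for one streamed tool call fragment (test data only)."""
    return SimpleNamespace(
        index=index, id=id, function=SimpleNamespace(name=name, arguments=arguments)
    )


deltas = [
    fake_delta(0, id="functions.get_weather:0", name="get_weather"),
    fake_delta(0, arguments='{"city": '),
    fake_delta(0, arguments='"Beijing"}'),
]
merged = merge_tool_call_deltas(deltas)
print(merged[0]["function"]["arguments"])  # → {"city": "Beijing"}
```

How an engine splits a tool call across chunks is not specified here, so the three-fragment split is purely illustrative.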
### Manually Parsing Tool Calls
The tool call requests generated by Kimi-K2 can also be parsed manually, which is especially useful when the service you are using does not provide a tool-call parser.
The tool call requests generated by Kimi-K2 are wrapped by `<|tool_calls_section_begin|>` and `<|tool_calls_section_end|>`,
with each tool call wrapped by `<|tool_call_begin|>` and `<|tool_call_end|>`. The tool ID and arguments are separated by `<|tool_call_argument_begin|>`.
The format of the tool ID is `functions.{func_name}:{idx}`, from which we can parse the function name.
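
To make the layout concrete, here is a hand-written string that follows the rules just described, together with a small helper that splits a tool ID into its parts. Both are illustrative assumptions: the string is not captured model output (real output may contain extra whitespace or text before the section), and `split_tool_id` is a hypothetical helper.

```python
# Hand-written example of the wrapped format (not real model output).
raw_output = (
    "<|tool_calls_section_begin|>"
    "<|tool_call_begin|>functions.get_weather:0"
    "<|tool_call_argument_begin|>{\"city\": \"Beijing\"}"
    "<|tool_call_end|>"
    "<|tool_calls_section_end|>"
)


def split_tool_id(tool_id: str):
    """Split 'functions.{func_name}:{idx}' into (func_name, idx)."""
    prefix, _, idx = tool_id.rpartition(":")
    func_name = prefix.split(".", 1)[1]
    return func_name, int(idx)


# The tool ID sits between the call-begin and argument-begin markers.
tool_id = raw_output.split("<|tool_call_begin|>")[1].split("<|tool_call_argument_begin|>")[0]
print(tool_id)                 # → functions.get_weather:0
print(split_tool_id(tool_id))  # → ('get_weather', 0)
```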
Based on the above rules, we can send a request directly to the completions interface and parse the tool calls manually.
```python
import json

import requests
from transformers import AutoTokenizer

# model_name, endpoint, tools and tool_map are reused from the examples above
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
msg = ''
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
while True:
    # Render the chat history and tool descriptions into a raw prompt
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        tools=tools,
        add_generation_prompt=True,
    )
    payload = {
        "model": model_name,
        "prompt": text,
        "max_tokens": 512
    }
    response = requests.post(
        f"{endpoint}/completions",
        headers={"Content-Type": "application/json"},
        json=payload,
        stream=False,
    )
    raw_out = response.json()

    raw_output = raw_out["choices"][0]["text"]
    tool_calls = extract_tool_call_info(raw_output)
    if len(tool_calls) == 0:
        # No tool calls
        msg = raw_output
        break
    else:
        # Record the assistant turn that requested the tool calls before adding the results
        messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            # Unpack the parsed arguments as keyword arguments
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })

print('-' * 100)
print(msg)
```
Here, `extract_tool_call_info` parses the model output and returns the tool call information. A simple implementation would be:
```python
import re


def extract_tool_call_info(tool_call_rsp: str):
    if '<|tool_calls_section_begin|>' not in tool_call_rsp:
        # No tool calls
        return []

    pattern = r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>"
    tool_calls_sections = re.findall(pattern, tool_call_rsp, re.DOTALL)
    if not tool_calls_sections:
        # The section was opened but never closed (e.g. truncated output)
        return []

    # Extract multiple tool calls
    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
    tool_calls = []
    for match in re.findall(func_call_pattern, tool_calls_sections[0], re.DOTALL):
        function_id, function_args = match
        # function_id: functions.get_weather:0
        # Drop the "functions." prefix and the trailing ":{idx}"
        function_name = function_id.split('.', 1)[1].rsplit(':', 1)[0]
        tool_calls.append(
            {
                "id": function_id,
                "type": "function",
                "function": {
                    "name": function_name,
                    "arguments": function_args
                }
            }
        )
    return tool_calls
```
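
Since the section pattern needs both the opening and the closing marker, output that is cut off partway through a tool call (for instance by the `max_tokens` limit in the request) will not match it. A caller may want to detect that case explicitly rather than treat it as ordinary output. The check below is a minimal sketch under that assumption; `has_unterminated_tool_section` is a hypothetical helper, and the sample strings are hand-written.

```python
SECTION_BEGIN = "<|tool_calls_section_begin|>"
SECTION_END = "<|tool_calls_section_end|>"


def has_unterminated_tool_section(text: str) -> bool:
    """Return True if a tool call section starts in `text` but never closes."""
    if SECTION_BEGIN not in text:
        return False
    after_begin = text.split(SECTION_BEGIN, 1)[1]
    return SECTION_END not in after_begin


complete = (
    SECTION_BEGIN
    + "<|tool_call_begin|>functions.get_weather:0"
    + "<|tool_call_argument_begin|>{\"city\": \"Beijing\"}<|tool_call_end|>"
    + SECTION_END
)
# Simulates output that stopped in the middle of the arguments.
truncated = (
    SECTION_BEGIN
    + "<|tool_call_begin|>functions.get_weather:0"
    + "<|tool_call_argument_begin|>{\"ci"
)

print(has_unterminated_tool_section(complete))   # → False
print(has_unterminated_tool_section(truncated))  # → True
```

When the check returns `True`, one option is to retry the request with a larger `max_tokens` value.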