## Tool Calling
To enable tool calling, you may need to set tool-call parser options when starting the service. See [deploy_guidance](./deploy_guidance.md) for details.
In Kimi-K2, a tool calling round trip consists of:
- Passing function descriptions to Kimi-K2
- Kimi-K2 deciding to make a function call and returning the information needed to perform it
- The user performing the function call, collecting the results, and passing them back to Kimi-K2
- Kimi-K2 continuing to generate content based on the results, until it believes it has enough information to answer the user
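Concretely, these steps produce a message history shaped like the following sketch (the ID, arguments, and contents are illustrative, not actual model output):

```python
conversation = [
    {"role": "user", "content": "What's the weather like in Beijing today?"},
    # The assistant turn that requests a call; id and arguments are hypothetical
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "functions.get_weather:0", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}},
    ]},
    # The caller runs the function and reports the result back
    {"role": "tool", "tool_call_id": "functions.get_weather:0",
     "name": "get_weather", "content": '{"weather": "Sunny"}'},
    # The final answer, produced once the model has enough information
    {"role": "assistant", "content": "It is sunny in Beijing today."},
]
roles = [m["role"] for m in conversation]
print(roles)  # ['user', 'assistant', 'tool', 'assistant']
```

The sections below build this loop step by step.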

### Preparing Tools
Suppose we have a function `get_weather` that queries current weather conditions in real time.
It accepts a city name as a parameter and returns the weather conditions. We prepare a structured description for it so that Kimi-K2 can understand its functionality.

```python
def get_weather(city):
    # Stub implementation: a real version would query a weather service
    return {"weather": "Sunny"}

# Collect the tool descriptions in tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information. Call this tool when the user needs to get weather information",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name",
                },
            },
        },
    },
}]

# Tool name -> function mapping for easy dispatch later
tool_map = {
    "get_weather": get_weather,
}
```
### Chat with tools
We use `openai.OpenAI` to send messages to Kimi-K2 along with the tool descriptions. Kimi-K2 autonomously decides whether, and how, to use the provided tools.
If Kimi-K2 decides a tool call is needed, it returns a result with `finish_reason='tool_calls'`, and the returned message includes the tool call information.
After performing the calls with the provided information, we append the results to the chat history and call Kimi-K2 again.
Kimi-K2 may need to call tools several times before it believes the current results can answer the user's question, so we keep looping until `finish_reason` is no longer `tool_calls`.

The results of the tool calls are appended to `messages` with `role='tool'`.

```python
import json
from openai import OpenAI

model_name = 'moonshotai/Kimi-K2-Instruct'
# `endpoint` is the base URL of your deployed service
client = OpenAI(base_url=endpoint, api_key='xxx')

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
    )
    choice = completion.choices[0]
    finish_reason = choice.finish_reason
    # Note: the finish_reason reported at the end of a tool call may vary
    # across inference engines, so adjust this check accordingly
    if finish_reason == "tool_calls":
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            tool_call_name = tool_call.function.name
            tool_call_arguments = json.loads(tool_call.function.arguments)
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            print("tool_result", tool_result)

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(choice.message.content)
```
### Tool Calling in Streaming Mode
Tool calling can also be used in streaming mode. In this case, we need to accumulate the tool call fragments returned in the stream until we have a complete tool call. See the code below:

```python
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
msg = ''
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
        stream=True,
    )
    tool_calls = []
    for chunk in completion:
        delta = chunk.choices[0].delta
        if delta.content:
            msg += delta.content
        if delta.tool_calls:
            for tool_call_chunk in delta.tool_calls:
                if tool_call_chunk.index is not None:
                    # Extend the tool_calls list to cover this index
                    while len(tool_calls) <= tool_call_chunk.index:
                        tool_calls.append({
                            "id": "",
                            "type": "function",
                            "function": {
                                "name": "",
                                "arguments": "",
                            },
                        })

                    tc = tool_calls[tool_call_chunk.index]

                    if tool_call_chunk.id:
                        tc["id"] += tool_call_chunk.id
                    if tool_call_chunk.function.name:
                        tc["function"]["name"] += tool_call_chunk.function.name
                    if tool_call_chunk.function.arguments:
                        tc["function"]["arguments"] += tool_call_chunk.function.arguments

        finish_reason = chunk.choices[0].finish_reason
    # Note: the finish_reason reported at the end of a tool call may vary
    # across inference engines, so adjust this check accordingly
    if finish_reason == "tool_calls":
        # Record the assistant turn that requested the calls before
        # appending the tool results
        messages.append({"role": "assistant", "content": msg, "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
        # The text generated alongside a tool call is not the final answer, so reset msg
        msg = ''

    print(msg)
```
### Manually Parsing Tool Calls
The tool call requests generated by Kimi-K2 can also be parsed manually, which is especially useful when the service you are using does not provide a tool-call parser.
Tool call requests are wrapped in `<|tool_calls_section_begin|>` and `<|tool_calls_section_end|>`, with each individual call wrapped in `<|tool_call_begin|>` and `<|tool_call_end|>`.
The tool ID and arguments are separated by `<|tool_call_argument_begin|>`.
The tool ID has the format `functions.{func_name}:{idx}`, from which we can recover the function name.
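As a quick illustration of this format, here is a hypothetical raw completion containing a single tool call (the argument values are made up), and how the function name falls out of the tool ID:

```python
import re

# Hypothetical raw model output containing one tool call
raw_output = (
    "<|tool_calls_section_begin|>"
    "<|tool_call_begin|>functions.get_weather:0"
    "<|tool_call_argument_begin|>"
    '{"city": "Beijing"}'
    "<|tool_call_end|>"
    "<|tool_calls_section_end|>"
)

# One tool call: the ID, then the arguments after <|tool_call_argument_begin|>
match = re.search(
    r"<\|tool_call_begin\|>\s*([\w\.]+:\d+)\s*"
    r"<\|tool_call_argument_begin\|>\s*(.*?)\s*<\|tool_call_end\|>",
    raw_output,
    re.DOTALL,
)
tool_id, args = match.groups()
# The tool ID format is functions.{func_name}:{idx}
func_name = tool_id.split('.')[1].split(':')[0]
print(tool_id, func_name, args)
```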

Based on these rules, we can post requests directly to the completions endpoint and parse the tool calls manually.

```python
import requests
from transformers import AutoTokenizer

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
msg = ''
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
while True:
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        tools=tools,
        add_generation_prompt=True,
    )
    payload = {
        "model": model_name,
        "prompt": text,
        "max_tokens": 512,
    }
    response = requests.post(
        f"{endpoint}/completions",
        headers={"Content-Type": "application/json"},
        json=payload,
        stream=False,
    )
    raw_out = response.json()

    raw_output = raw_out["choices"][0]["text"]
    tool_calls = extract_tool_call_info(raw_output)
    if len(tool_calls) == 0:
        # No tool calls
        msg = raw_output
        break
    else:
        # Record the assistant turn that requested the calls; the exact
        # message shape accepted here may depend on the chat template
        messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(msg)
```
Here, `extract_tool_call_info` parses the model output and returns the tool call information. A simple implementation would be:
```python
import re

def extract_tool_call_info(tool_call_rsp: str):
    if '<|tool_calls_section_begin|>' not in tool_call_rsp:
        # No tool calls
        return []
    pattern = r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>"
    tool_calls_sections = re.findall(pattern, tool_call_rsp, re.DOTALL)

    # Extract the individual tool calls
    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
    tool_calls = []
    for match in re.findall(func_call_pattern, tool_calls_sections[0], re.DOTALL):
        function_id, function_args = match
        # function_id looks like functions.get_weather:0
        function_name = function_id.split('.')[1].split(':')[0]
        tool_calls.append(
            {
                "id": function_id,
                "type": "function",
                "function": {
                    "name": function_name,
                    "arguments": function_args,
                },
            }
        )
    return tool_calls
```