---
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
license: apache-2.0
library_name: vllm
inference: false
base_model:
- mistralai/Devstral-Small-2505
extra_gated_description: >-
  If you want to learn more about how we process your personal data, please read
  our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
pipeline_tag: text2text-generation
---

# Model Card for mistralai/Devstral-Small-2505

Devstral is an agentic LLM for software engineering tasks built under a collaboration between [Mistral AI](https://mistral.ai/) and [All Hands AI](https://www.all-hands.dev/) 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open-source model on this [benchmark](#benchmark-results).

It is finetuned from [Mistral-Small-3.1](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503), so it has a long context window of up to 128k tokens. As a coding agent, Devstral is text-only: the vision encoder was removed from `Mistral-Small-3.1` before fine-tuning.

For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.

Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).

## Key Features:
- **Agentic coding**: Devstral is designed to excel at agentic coding tasks, making it a great choice for software engineering agents.
- **Lightweight**: With its compact size of just 24 billion parameters, Devstral is light enough to run on a single RTX 4090 or a Mac with 32GB RAM, making it an appropriate model for local deployment and on-device use.
- **Apache 2.0 License**: Open license allowing usage and modification for both commercial and non-commercial purposes.
- **Context Window**: A 128k context window.
- **Tokenizer**: Utilizes a Tekken tokenizer with a 131k vocabulary size.

## Benchmark Results

### SWE-Bench

Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA by 6%.

| Model            | Scaffold           | SWE-Bench Verified (%) |
|------------------|--------------------|------------------------|
| Devstral         | OpenHands Scaffold | **46.8**               |
| GPT-4.1-mini     | OpenAI Scaffold    | 23.6                   |
| Claude 3.5 Haiku | Anthropic Scaffold | 40.6                   |
| SWE-smith-LM 32B | SWE-agent Scaffold | 40.2                   |

When evaluated under the same test scaffold (OpenHands, provided by All Hands AI 🙌), Devstral exceeds far larger models such as Deepseek-V3-0324 and Qwen3 235B-A22B.

## Usage

We recommend using Devstral with the [OpenHands](https://github.com/All-Hands-AI/OpenHands/tree/main) scaffold.
You can use it either through our API or by running it locally.

### API

Follow these [instructions](https://docs.mistral.ai/getting-started/quickstart/#account-setup) to create a Mistral account and get an API key.

Then run these commands to start the OpenHands Docker container:
```bash
export MISTRAL_API_KEY=<MY_KEY>

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik

mkdir -p ~/.openhands-state && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2505","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands-state/settings.json

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.39
```

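Optionally, you can sanity-check the API key before launching the container. This is a minimal sketch (not part of the official setup), assuming the `requests` package and assuming the model is exposed on the Mistral API under the name `devstral-small-2505`, matching the `settings.json` above:

```python
# Hypothetical sanity check of the API key; not required for OpenHands itself.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "devstral-small-2505",  # assumed API model name, as referenced in settings.json
        "messages": [{"role": "user", "content": "Say hello in one word."}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```
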
### Local inference

You can also run the model locally, either with LM Studio or with one of the other providers listed below.

The model can be deployed with the following libraries:
- [`LMStudio (recommended for quantized model)`](https://lmstudio.ai/): See [here](#lmstudio-recommended-for-quantized-model)
- [`vllm (recommended)`](https://github.com/vllm-project/vllm): See [here](#vllm-recommended)
- [`ollama`](https://github.com/ollama/ollama): See [here](#ollama)
- [`mistral-inference`](https://github.com/mistralai/mistral-inference): See [here](#mistral-inference)
- [`transformers`](https://github.com/huggingface/transformers): See [here](#transformers)

### OpenHands (recommended)

#### Launch a server to deploy Devstral-Small-2505

Make sure you have launched an OpenAI-compatible server such as vLLM or Ollama, as described below. Then you can use OpenHands to interact with `Devstral-Small-2505`.

For this tutorial we spun up a vLLM server with the following command:
```bash
vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2
```

The server address should be in the following format: `http://<your-server-url>:8000/v1`

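To confirm the server is up and to see the exact model name to use, you can query the OpenAI-compatible `/v1/models` route. A minimal sketch, assuming the server above is reachable at that address:

```python
# Hypothetical check that the vLLM server is reachable; not required for OpenHands itself.
import requests

base_url = "http://<your-server-url>:8000/v1"  # replace with your server address
models = requests.get(f"{base_url}/models", headers={"Authorization": "Bearer token"}).json()
print([m["id"] for m in models["data"]])  # should list "mistralai/Devstral-Small-2505"
```
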
#### Launch OpenHands

You can follow the OpenHands installation instructions [here](https://docs.all-hands.dev/modules/usage/installation).

The easiest way to launch OpenHands is to use the Docker image:
```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38
```

Then, you can access the OpenHands UI at `http://localhost:3000`.

#### Connect to the server

When accessing the OpenHands UI, you will be prompted to connect to a server. Use the advanced mode to connect to the server you launched earlier.

Fill in the following fields:
- **Custom Model**: `openai/mistralai/Devstral-Small-2505`
- **Base URL**: `http://<your-server-url>:8000/v1`
- **API Key**: `token` (or any other token you used to launch the server, if any)

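The `openai/` prefix tells OpenHands' LiteLLM backend to treat the endpoint as OpenAI-compatible; when calling the server directly you use the plain model name. A minimal sketch of the equivalent direct call, assuming the `openai` Python package (>= 1.0) is installed:

```python
# Hypothetical direct check of the same endpoint and credentials OpenHands will use.
from openai import OpenAI  # assumes `pip install openai` (>= 1.0)

client = OpenAI(base_url="http://<your-server-url>:8000/v1", api_key="token")
out = client.chat.completions.create(
    model="mistralai/Devstral-Small-2505",  # note: no "openai/" prefix outside OpenHands
    messages=[{"role": "user", "content": "ping"}],
)
print(out.choices[0].message.content)
```
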
#### Use OpenHands powered by Devstral

Now you're ready to use Devstral Small inside OpenHands by **starting a new conversation**. Let's build a To-Do list app.

<details>
<summary>To-Do list app</summary>

1. Let's ask Devstral to generate the app with the following prompt:

```txt
Build a To-Do list app with the following requirements:
- Built using FastAPI and React.
- Make it a one page app that:
   - Allows to add a task.
   - Allows to delete a task.
   - Allows to mark a task as done.
   - Displays the list of tasks.
- Store the tasks in a SQLite database.
```

2. Let's see the result

You should see the agent construct the app and be able to explore the code it generated.

If it doesn't do so automatically, ask Devstral to deploy the app or deploy it manually, then open the deployed front-end URL to see the app.

3. Iterate

Now that you have a first result, you can iterate on it by asking your agent to improve it. For example, in the generated app we could click on a task to mark it checked, but a checkbox would improve UX. You could also ask it to add a feature to edit a task, or to filter tasks by status.

Enjoy building with Devstral Small and OpenHands!

</details>

### LMStudio (recommended for quantized model)

Download the weights from Hugging Face:

```bash
pip install -U "huggingface_hub[cli]"
huggingface-cli download \
  "mistralai/Devstral-Small-2505_gguf" \
  --include "devstralQ4_K_M.gguf" \
  --local-dir "mistralai/Devstral-Small-2505_gguf/"
```

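If you prefer to stay in Python, the same file can be fetched with `huggingface_hub`; this sketch is equivalent to the CLI call above:

```python
# Equivalent of the CLI download above, using the huggingface_hub Python API.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="mistralai/Devstral-Small-2505_gguf",
    filename="devstralQ4_K_M.gguf",
    local_dir="mistralai/Devstral-Small-2505_gguf/",
)
print(gguf_path)
```
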
You can serve the model locally with [LMStudio](https://lmstudio.ai/):
* Download [LM Studio](https://lmstudio.ai/) and install it.
* Install the `lms` CLI: `~/.lmstudio/bin/lms bootstrap`
* In a bash terminal, run `lms import devstralQ4_K_M.gguf` in the directory where you've downloaded the model checkpoint (e.g. `mistralai/Devstral-Small-2505_gguf`).
* Open the LM Studio application and click the terminal icon to open the developer tab. Click "Select a model to load" and pick "Devstral Q4 K M". Toggle the status button to start the model, and in the settings toggle "Serve on Local Network" on.
* On the right tab you will see the API identifier, which should be `devstralq4_k_m`, and an API address under "API Usage". Note this address; we will use it in the next step.

#### Launch OpenHands

You can now interact with the model served from LM Studio through OpenHands. Start the OpenHands server with Docker:

```bash
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik
docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38
```

The server will start at http://0.0.0.0:3000. Open it in your browser: you will see a tab "AI Provider Configuration". Click "see advanced settings" on the second line.
In the new tab, toggle "Advanced" on. Set the custom model to `mistral/devstralq4_k_m` and the Base URL to the API address noted in the previous step in LM Studio. Set the API Key to `dummy` and click "Save Changes".

Now you can start a new conversation with the agent by clicking the plus sign in the left bar.

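To verify the LM Studio endpoint outside OpenHands, you can hit its OpenAI-compatible API directly. A minimal sketch, assuming the default LM Studio server address `http://localhost:1234/v1` (substitute the address shown under "API Usage" if it differs):

```python
# Hypothetical check of the LM Studio endpoint; replace base_url with the
# address shown under "API Usage" if it differs from the default.
from openai import OpenAI  # assumes `pip install openai` (>= 1.0)

client = OpenAI(base_url="http://localhost:1234/v1", api_key="dummy")
resp = client.chat.completions.create(
    model="devstralq4_k_m",  # the API identifier shown in LM Studio
    messages=[{"role": "user", "content": "Write a haiku about unit tests."}],
)
print(resp.choices[0].message.content)
```
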
### vLLM (recommended)

We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
to implement production-ready inference pipelines.

**_Installation_**

Make sure you install [`vLLM >= 0.8.5`](https://github.com/vllm-project/vllm/releases/tag/v0.8.5):

```
pip install vllm --upgrade
```

Doing so should automatically install [`mistral_common >= 1.5.4`](https://github.com/mistralai/mistral-common/releases/tag/v1.5.4).

To check:
```
python -c "import mistral_common; print(mistral_common.__version__)"
```

You can also use a ready-to-go [docker image](https://github.com/vllm-project/vllm/blob/main/Dockerfile) or one from [Docker Hub](https://hub.docker.com/layers/vllm/vllm-openai/latest/images/sha256-de9032a92ffea7b5c007dad80b38fd44aac11eddc31c435f8e52f3b7404bbf39).

#### Server

We recommend that you use Devstral in a server/client setting.

1. Spin up a server:

```
vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2
```

2. To query the server from a client, you can use a simple Python snippet:

```py
import requests
import json
from huggingface_hub import hf_hub_download


url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2505"

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Write a function that computes fibonacci in Python.",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json()["choices"][0]["message"]["content"])
```

<details>
<summary>Output</summary>

Certainly! The Fibonacci sequence is a series of numbers where each number is the sum of the two preceding ones, usually starting with 0 and 1. Here's a simple Python function to compute the Fibonacci sequence:

### Iterative Approach
This approach uses a loop to compute the Fibonacci number iteratively.

```python
def fibonacci(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    a, b = 0, 1
    for _ in range(2, n):
        a, b = b, a + b
    return b

# Example usage:
print(fibonacci(10))  # Output: 34
```

### Recursive Approach
This approach uses recursion to compute the Fibonacci number. Note that this is less efficient for large `n` due to repeated calculations.

```python
def fibonacci_recursive(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)

# Example usage:
print(fibonacci_recursive(10))  # Output: 34
```

### Memoization Approach
This approach uses memoization to store previously computed Fibonacci numbers, making it more efficient than the simple recursive approach.

```python
def fibonacci_memo(n, memo={}):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    elif n in memo:
        return memo[n]

    memo[n] = fibonacci_memo(n - 1, memo) + fibonacci_memo(n - 2, memo)
    return memo[n]

# Example usage:
print(fibonacci_memo(10))  # Output: 34
```

### Dynamic Programming Approach
This approach uses an array to store the Fibonacci numbers up to `n`.

```python
def fibonacci_dp(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    fib = [0, 1] + [0] * (n - 2)
    for i in range(2, n):
        fib[i] = fib[i - 1] + fib[i - 2]
    return fib[n - 1]

# Example usage:
print(fibonacci_dp(10))  # Output: 34
```

You can choose any of these approaches based on your needs. The iterative and dynamic programming approaches are generally more efficient for larger values of `n`.

</details>

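Since the server above was launched with `--tool-call-parser mistral --enable-auto-tool-choice`, you can also exercise function calling through the OpenAI-compatible API. A minimal sketch, assuming the `openai` Python package (>= 1.0); the `run_tests` tool is hypothetical and only illustrates the request shape:

```python
# Sketch of a tool call against the vLLM server launched above; the `run_tests`
# tool is made up for illustration only.
from openai import OpenAI  # assumes `pip install openai` (>= 1.0)

client = OpenAI(base_url="http://<your-server-url>:8000/v1", api_key="token")

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the output.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory to run the tests in."},
            },
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="mistralai/Devstral-Small-2505",
    messages=[{"role": "user", "content": "Run the tests in ./src and summarize any failures."}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```
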
### Mistral-inference

We recommend using mistral-inference to quickly try out / "vibe-check" Devstral.

#### Install

Make sure to have `mistral_inference >= 1.6.0` installed.

```bash
pip install mistral_inference --upgrade
```

#### Download

```python
from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Devstral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Devstral-Small-2505", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)
```

#### Chat

You can chat with the model using the following command:

```bash
mistral-chat $HOME/mistral_models/Devstral --instruct --max_tokens 300
```

If you prompt it with "Write me a unique and efficient function that computes fibonacci in Python", the model should generate something along the following lines:

<details>
<summary>Output</summary>

Certainly! A common and efficient way to compute Fibonacci numbers is by using memoization to store previously computed values. This avoids redundant calculations and significantly improves performance. Below is a Python function that uses memoization to compute Fibonacci numbers efficiently:

```python
def fibonacci(n, memo=None):
    if memo is None:
        memo = {}

    if n in memo:
        return memo[n]

    if n <= 1:
        return n

    memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
    return memo[n]

# Example usage:
n = 10
print(f"Fibonacci number at position {n} is {fibonacci(n)}")
```

### Explanation:

1. **Base Case**: If `n` is 0 or 1, the function returns `n` because the Fibonacci sequence starts with 0 and 1.
2. **Memoization**: The function uses a dictionary `memo` to store the results of previously computed Fibonacci numbers.
3. **Recursive Case**: For other values of `n`, the function recursively computes the Fibonacci number by summing the results of `fibonacci(n - 1)` and `fibonacci(n - 2)`.

</details>

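If you'd rather drive the model from Python than from the interactive CLI, a minimal sketch using the `mistral_inference` generate API could look like this (paths assume the download step above; `max_tokens` and `temperature` are illustrative):

```python
# Sketch only: drives the downloaded checkpoint via mistral_inference instead of mistral-chat.
from pathlib import Path

from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.generate import generate
from mistral_inference.transformer import Transformer

mistral_models_path = Path.home().joinpath("mistral_models", "Devstral")

tokenizer = MistralTokenizer.from_file(str(mistral_models_path / "tekken.json"))
model = Transformer.from_folder(str(mistral_models_path))

request = ChatCompletionRequest(
    messages=[UserMessage(content="Write me a function that computes fibonacci in Python.")]
)
tokens = tokenizer.encode_chat_completion(request).tokens

out_tokens, _ = generate(
    [tokens], model, max_tokens=300, temperature=0.15,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.decode(out_tokens[0]))
```
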
### Ollama

You can run Devstral using the [Ollama](https://ollama.ai/) CLI:

```bash
ollama run devstral
```

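Ollama also exposes a local HTTP API (by default on port 11434), so you can script against the same model. A minimal sketch, assuming the `devstral` model has been pulled as above and the default Ollama address:

```python
# Hypothetical scripted call to a locally running Ollama instance (default address assumed).
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "devstral",
        "messages": [{"role": "user", "content": "Write a function that computes fibonacci in Python."}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])
```
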
### Transformers

To make the best use of our model with Transformers, make sure you have [`mistral-common >= 1.5.5`](https://github.com/mistralai/mistral-common) installed to use our tokenizer.

```bash
pip install mistral-common --upgrade
```

Then load our tokenizer along with the model and generate:

```python
import torch

from mistral_common.protocol.instruct.messages import (
    SystemMessage, UserMessage
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.tokenizers.tekken import SpecialTokenPolicy
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

model_id = "mistralai/Devstral-Small-2505"
tekken_file = hf_hub_download(repo_id=model_id, filename="tekken.json")
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

tokenizer = MistralTokenizer.from_file(tekken_file)

model = AutoModelForCausalLM.from_pretrained(model_id)

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content=SYSTEM_PROMPT),
            UserMessage(content="Write me a function that computes fibonacci in Python."),
        ],
    )
)

output = model.generate(
    input_ids=torch.tensor([tokenized.tokens]),
    max_new_tokens=1000,
)[0]

decoded_output = tokenizer.decode(output[len(tokenized.tokens):])
print(decoded_output)
```
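
The snippet above loads the weights in the default precision; for a 24B-parameter model you will typically want a GPU. A variant of the loading call, assuming `accelerate` is installed and enough GPU memory is available:

```python
# Optional: load in bfloat16 and let accelerate place the weights on available GPUs.
# Assumes `pip install accelerate` and sufficient GPU memory for a 24B model.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Devstral-Small-2505",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```

With `device_map="auto"`, remember to move the input tensor to the model's device (e.g. `torch.tensor([tokenized.tokens]).to(model.device)`) before calling `generate`.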