danielhanchen committed on
Commit 122dc39 · verified · 1 Parent(s): e9b6960

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -43,7 +43,7 @@ It is finetuned from [Mistral-Small-3.1](https://huggingface.co/mistralai/Mistra
43
 
44
  For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
45
 
46
- Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
47
 
48
  **Updates compared to [`Devstral Small 1.0`](https://huggingface.co/mistralai/Devstral-Small-2505):**
49
  - Improved performance, please refer to the [benchmark results](#benchmark-results).
@@ -63,11 +63,11 @@ Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral).
63
 
64
  ### SWE-Bench
65
 
66
- Devstral Small 1.1 achieves a score of **52.4%** on SWE-Bench Verified, outperforming Devstral Small 1.0 by +5,6% and the second best state of the art model by +10.2%.
67
 
68
  | Model | Agentic Scaffold | SWE-Bench Verified (%) |
69
  |--------------------|--------------------|------------------------|
70
- | Devstral Small 1.1 | OpenHands Scaffold | **52.4** |
71
  | Devstral Small 1.0 | OpenHands Scaffold | *46.8* |
72
  | GPT-4.1-mini | OpenAI Scaffold | 23.6 |
73
  | Claude 3.5 Haiku | Anthropic Scaffold | 40.6 |
@@ -512,4 +512,4 @@ Finally, the game is ready to be played:
512
 
513
  ![space invaders pong - game](assets/space_invaders_pong/game.png)
514
 
515
- Don't hesitate to iterate or give more information to Devstral to improve the game!
 
43
 
44
  For enterprises requiring specialized capabilities (increased context, domain-specific knowledge, etc.), we will release commercial models beyond what Mistral AI contributes to the community.
45
 
46
+ Learn more about Devstral in our [blog post](https://mistral.ai/news/devstral-2507).
47
 
48
  **Updates compared to [`Devstral Small 1.0`](https://huggingface.co/mistralai/Devstral-Small-2505):**
49
  - Improved performance, please refer to the [benchmark results](#benchmark-results).
 
63
 
64
  ### SWE-Bench
65
 
66
+ Devstral Small 1.1 achieves a score of **53.6%** on SWE-Bench Verified, outperforming Devstral Small 1.0 by +6,8% and the second best state of the art model by +11.4%.
67
 
68
  | Model | Agentic Scaffold | SWE-Bench Verified (%) |
69
  |--------------------|--------------------|------------------------|
70
+ | Devstral Small 1.1 | OpenHands Scaffold | **53.6** |
71
  | Devstral Small 1.0 | OpenHands Scaffold | *46.8* |
72
  | GPT-4.1-mini | OpenAI Scaffold | 23.6 |
73
  | Claude 3.5 Haiku | Anthropic Scaffold | 40.6 |
 
512
 
513
  ![space invaders pong - game](assets/space_invaders_pong/game.png)
514
 
515
+ Don't hesitate to iterate or give more information to Devstral to improve the game!
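
The margins quoted in the SWE-Bench sentence on both sides of this diff can be checked with simple arithmetic. The sketch below is only an illustration: the helper names `delta_points` and `implied_baseline` are hypothetical, and the runner-up score is not stated in the visible hunks, so it is derived here from the quoted margins rather than taken from the table.

```python
# Sanity check of the SWE-Bench Verified margins quoted in the diff.
# Helper names are hypothetical; scores come from the lines shown above.

def delta_points(new_score: float, old_score: float) -> float:
    """Difference between two scores, in percentage points."""
    return round(new_score - old_score, 1)


def implied_baseline(score: float, margin: float) -> float:
    """Score of the model being compared against, given the quoted margin."""
    return round(score - margin, 1)


SMALL_1_0 = 46.8      # Devstral Small 1.0, from the table
OLD_SMALL_1_1 = 52.4  # Devstral Small 1.1 before this commit
NEW_SMALL_1_1 = 53.6  # Devstral Small 1.1 after this commit

if __name__ == "__main__":
    # Margin over Devstral Small 1.0, before and after the commit.
    print(delta_points(OLD_SMALL_1_1, SMALL_1_0))
    print(delta_points(NEW_SMALL_1_1, SMALL_1_0))
    # Runner-up score implied by the quoted margins (assumption: the
    # sentence compares against the same model in both versions).
    print(implied_baseline(OLD_SMALL_1_1, 10.2))
    print(implied_baseline(NEW_SMALL_1_1, 11.4))
```

Both versions of the sentence are consistent with the 46.8 row for Devstral Small 1.0, and both margins over the second-best model imply the same runner-up score, so the commit changes only Devstral Small 1.1's own result.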