Nebius

company

Verified

https://nebius.com

nebiusai

nebius

Inference Provider

2,798,819 monthly requests

AI & ML interests

AI-centric cloud platform ready for intensive workloads. Training-ready platform with NVIDIA® H100 Tensor Core GPUs. Competitive pricing. Dedicated support.

Recent Activity

ibragim-bad new activity about 22 hours ago

nebius/SWE-rebench:Could this dataset be repurposed for LLM training?

ibragim-bad new activity 11 days ago

nebius/SWE-rebench:GLM 4.5

ibragim-bad new activity 14 days ago

nebius/SWE-rebench:Could you provide the Docker images?

Articles

Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥

ibragim-bad

in nebius/SWE-rebench about 22 hours ago

Could this dataset be repurposed for LLM training?

#7 opened about 22 hours ago by

ibragim-bad

in nebius/SWE-rebench 11 days ago

GLM 4.5

#6 opened 11 days ago by

ibragim-bad

posted an update 14 days ago

We tested Qwen3-Coder, GPT-5, and 30+ other models on new SWE-bench-like tasks from July 2025!

Hi all, I’m Ibragim from Nebius.

We ran a benchmark on 34 fresh GitHub PR tasks from July 2025 using the SWE-rebench leaderboard (https://swe-rebench.com/leaderboard). These are real, recent problems with no training-set contamination, and the evaluation covers both proprietary and open-source models.

Quick takeaways:

> GPT-5-Medium leads overall (29.4% resolved rate, 38.2% pass@5).
> Qwen3-Coder is the best open-source performer, matching GPT-5-High in pass@5 (32.4%) despite a lower resolved rate.
> Claude Sonnet 4.0 lags behind in pass@5 at 23.5%.

All tasks come from nebius/SWE-rebench-leaderboard, a continuously updated, decontaminated dataset of real-world SWE tasks.
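For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how a resolved rate and pass@5 could be computed from per-task run outcomes. It assumes each task is attempted five times, that the resolved rate is the fraction of successful runs averaged over tasks, and that pass@5 counts a task as solved if any of its five runs succeeds. These definitions, the function names, and the sample data are illustrative assumptions, not the leaderboard's actual code or results.

```python
from typing import Dict, List


def resolved_rate(runs: Dict[str, List[bool]]) -> float:
    """Average, over tasks, of the fraction of runs that resolved the task.

    Assumed definition for illustration only.
    """
    return sum(sum(outcomes) / len(outcomes) for outcomes in runs.values()) / len(runs)


def pass_at_k(runs: Dict[str, List[bool]], k: int = 5) -> float:
    """Fraction of tasks resolved by at least one of the first k runs."""
    return sum(any(outcomes[:k]) for outcomes in runs.values()) / len(runs)


def tasks_for_rate(rate_percent: float, n_tasks: int) -> int:
    """Convert a reported percentage back into a whole number of tasks."""
    return round(rate_percent / 100 * n_tasks)


# Hypothetical outcomes for four made-up tasks, five runs each.
example_runs = {
    "task-a": [True, True, False, True, True],
    "task-b": [False, False, False, False, False],
    "task-c": [False, True, False, False, False],
    "task-d": [True, True, True, True, True],
}

print(f"resolved rate: {resolved_rate(example_runs):.1%}")
print(f"pass@5:        {pass_at_k(example_runs):.1%}")

# With 34 tasks, the pass@5 figures quoted in the post map to whole task counts.
for pct in (38.2, 32.4, 23.5):
    print(f"{pct}% of 34 tasks is about {tasks_for_rate(pct, 34)} tasks")
```

Under these assumptions, a model can have a high pass@5 and a lower resolved rate when it solves many tasks only occasionally, which is consistent with the Qwen3-Coder observation in the takeaways.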

ibragim-bad

in nebius/SWE-rebench 14 days ago

Could you provide the Docker images?

#2 opened 3 months ago by

ibragim-bad

authored a paper 18 days ago

Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Paper • 2508.03501 • Published 21 days ago • 53

ibragim-bad

updated a dataset 18 days ago

nebius/SWE-rebench

Viewer • Updated 18 days ago • 21.3k • 3.2M • 21

ibragim-bad

in nebius/SWE-rebench 19 days ago

How to build docker image for each instance?

#4 opened about 1 month ago by

Any plans to release all of the Docker images for this dataset?

#5 opened 20 days ago by

ibragim-bad

updated a dataset 22 days ago

nebius/SWE-rebench-leaderboard

Viewer • Updated 22 days ago • 409 • 475 • 6

ibragim-bad

published a dataset about 2 months ago

nebius/SWE-rebench-leaderboard

Viewer • Updated 22 days ago • 409 • 475 • 6

ibragim-bad

in nebius/SWE-rebench 3 months ago

Add task category and library name to dataset card

#3 opened 3 months ago by

hr0nix

authored a paper 3 months ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88

ibragim-bad

authored a paper 3 months ago

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Paper • 2505.20411 • Published May 26 • 88

ibragim-bad

updated a dataset 3 months ago

nebius/SWE-bench-extra

Viewer • Updated May 28 • 6.38k • 138 • 44

ibragim-bad

published a dataset 4 months ago

nebius/SWE-rebench

Viewer • Updated 18 days ago • 21.3k • 3.2M • 21

ibragim-bad

in nebius/SWE-bench-extra 6 months ago

Do you have the swe-bench harness code?

#2 opened 7 months ago by

ibragim-bad

updated 4 datasets 8 months ago

nebius/SWE-agent-trajectories

Viewer • Updated Dec 23, 2024 • 80k • 618 • 62

nebius/SWE-bench-extra

Viewer • Updated May 28 • 6.38k • 138 • 44

nebius/SWE-agent-trajectories

Viewer • Updated Dec 23, 2024 • 80k • 618 • 62

nebius/SWE-bench-extra

Viewer • Updated May 28 • 6.38k • 138 • 44