lewtun (HF Staff) committed
Commit 2ae6e03 · verified · 1 Parent(s): f199e18

Update README.md

Files changed (1): README.md (+13, -0)
README.md CHANGED
@@ -25,8 +25,21 @@ OlympicCoder-32B is a code model that achieves very strong performance on competi
 
 ## Evaluation
 
+ We compare the performance of OlympicCoder models on two main benchmarks for competitive coding:
+
+ * **[IOI'2024:](https://github.com/huggingface/ioi)** 6 very challenging problems from the 2024 International Olympiad in Informatics. Models are allowed up to 50 submissions per problem.
+ * **[LiveCodeBench:](https://livecodebench.github.io)** Python programming problems sourced from platforms like CodeForces and LeetCode. We use the `v4_v5` subset of [`livecodebench/code_generation_lite`](https://huggingface.co/datasets/livecodebench/code_generation_lite), which corresponds to 268 problems. We use `lighteval` to evaluate models on LiveCodeBench with the sampling parameters described [here](https://github.com/huggingface/open-r1?tab=readme-ov-file#livecodebench).
+
+ > [!NOTE]
+ > The OlympicCoder models were post-trained exclusively on C++ solutions generated by DeepSeek-R1. As a result, performance on LiveCodeBench should be considered partially _out-of-domain_, since this benchmark expects models to output solutions in Python.
+
+ ### IOI'24
+
 ![](./ioi-evals.png)
 
+ ### LiveCodeBench
+
+ ![](./lcb-evals.png)
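
The LiveCodeBench bullet in the diff above defers to the linked open-r1 instructions for the exact `lighteval` command and sampling parameters. As a rough guide, the sketch below shows how such a run might be launched with lighteval's vLLM backend; the model id `open-r1/OlympicCoder-32B`, the task string `extended|lcb:codegeneration|0|0`, and the generation parameters are assumptions based on my reading of those instructions, not part of this commit, so verify them against the linked README before comparing numbers.

```shell
# Minimal sketch (assumptions noted above): evaluate a model on LiveCodeBench
# with lighteval's vLLM backend, mirroring the open-r1 setup linked in the diff.
NUM_GPUS=8
MODEL=open-r1/OlympicCoder-32B   # adjust to the model repo you want to evaluate
MODEL_ARGS="pretrained=$MODEL,dtype=bfloat16,data_parallel_size=$NUM_GPUS,max_model_length=32768,gpu_memory_utilization=0.8,generation_parameters={max_new_tokens:32768,temperature:0.6,top_p:0.95}"
OUTPUT_DIR=data/evals/$MODEL

# "extended|lcb:codegeneration|0|0" is the LiveCodeBench task name used in open-r1;
# check it against the tasks registered in your installed lighteval version.
lighteval vllm "$MODEL_ARGS" "extended|lcb:codegeneration|0|0" \
    --use-chat-template \
    --output-dir "$OUTPUT_DIR"
```

If you change the sampling parameters, the resulting scores will not be directly comparable to the LiveCodeBench plot referenced in the README.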