Update README.md
README.md
CHANGED
@@ -57,7 +57,7 @@ print(outputs[0]["generated_text"][-1])
 
 ## Evaluation Results
 
-We evaluate
+We evaluate Nemotron-UltraLong-8B on a diverse set of benchmarks, including long-context tasks (e.g., RULER, LV-Eval, and InfiniteBench) and standard tasks (e.g., MMLU, MATH, GSM-8K, and HumanEval). UltraLong-8B achieves superior performance on ultra-long context tasks while maintaining competitive results on standard benchmarks.
 
 ### Needle in a Haystack
 