Vendor ABC

company

AI & ML interests

None defined yet.

Rajiv Shah, Taylor Linton

rajistics 
posted an update 4 months ago
Having some fun with long context benchmarks (watch the video!!)

NoLiMa: NoLiMa: Long-Context Evaluation Beyond Literal Matching (2502.05167)
Fiction.LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
Michelangelo: https://deepmind.google/research/publications/117639/
LongGenBench: Spinning the Golden Thread: Benchmarking Long-Form Generation in Language Models (2409.02076)
NeedleBench: NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? (2407.11963)
RULER: RULER: What's the Real Context Size of Your Long-Context Language Models? (2404.06654)

For more: https://www.reddit.com/r/rajistics/comments/1jxwk29/long_context_llm_benchmarks_video/

Let me know if you like these posts.
rajistics 
updated 3 models almost 3 years ago

vendorabc/tabular-playground

Tabular Classification • Updated Aug 30, 2022

vendorabc/modeltest

Tabular Classification • Updated Aug 30, 2022

vendorabc/modelhubexample

Tabular Classification • Updated Aug 30, 2022