hallucinations-leaderboard

community

https://www.neuralnoise.com

pminervini

Recent Activity

clefourrier authored a paper 20 days ago

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

acDante authored a paper about 1 month ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

acDante authored a paper about 1 month ago

Are We Done with MMLU?

hallucinations-leaderboard's activity

clefourrier

authored a paper 20 days ago

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Paper • 2412.03304 • Published 22 days ago • 17

acDante

authored 4 papers about 1 month ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8 • 8

Are We Done with MMLU?

Paper • 2406.04127 • Published Jun 6 • 37

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21 • 19

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21 • 7

pminervini

updated 2 datasets about 2 months ago

hallucinations-leaderboard/requests

Preview • Updated Oct 31 • 64k

hallucinations-leaderboard/results

Updated Oct 31 • 161k • 2

pminervini

authored 2 papers about 2 months ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21 • 19

Adapting Neural Link Predictors for Data-Efficient Complex Query Answering

Paper • 2301.12313 • Published Jan 29, 2023

aryopg

authored 3 papers about 2 months ago

CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Paper • 2410.10336 • Published Oct 14 • 2

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21 • 19

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21 • 7

pminervini

authored a paper about 2 months ago

Attention Is All You Need But You Don't Need All Of It For Inference of Large Language Models

Paper • 2407.15516 • Published Jul 22 • 1

aryopg

authored a paper about 2 months ago

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

Paper • 2410.18860 • Published Oct 24 • 9

pminervini

authored a paper about 2 months ago

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

Paper • 2410.18860 • Published Oct 24 • 9

rohitsaxena

authored a paper 4 months ago

MovieSum: An Abstractive Summarization Dataset for Movie Screenplays

Paper • 2408.06281 • Published Aug 12 • 9

clefourrier

authored 2 papers 6 months ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8 • 8

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 185

rohitsaxena

authored a paper 6 months ago

Select and Summarize: Scene Saliency for Movie Script Summarization

Paper • 2404.03561 • Published Apr 4 • 2

pminervini

authored a paper 6 months ago

A Simple and Effective $L_2$ Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17 • 22