Gabriele Sarti

gsarti

https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

updated a collection about 14 hours ago

🔍 Interpretability & Analysis of LMs

upvoted a paper about 14 hours ago

Can Interpretation Predict Behavior on Unseen Data?

liked a dataset about 18 hours ago

sardinelab/MF2

New activity in gsarti/gradio_highlightedtextbox 10 days ago

Update Dockerfile

#8 opened 11 days ago by

New activity in gsarti/gradio_highlightedtextbox 11 days ago

run gradio cc build

#7 opened 11 days ago by

🚩 Report: Not working

#3 opened 11 months ago by

tip + patch to solve typing

#2 opened about 1 year ago by

fix(parser): Correctly handle literal less-than signs in text

#6 opened 11 days ago by

refactor(docker): Build package from source in Dockerfile

#5 opened 11 days ago by

rsn86/fix_package

#4 opened 11 days ago by

New activity in gsarti/unsup_wqe_metrics about 1 month ago

Update README.md

#3 opened about 1 month ago by

Update task category

#2 opened about 1 month ago by

Add dataset card, link to paper, and link to code

#1 opened about 1 month ago by

commented a paper about 1 month ago

Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement

Paper • 2505.23183 • Published May 29 • 2

commented a paper about 2 months ago

Steering Large Language Models for Machine Translation Personalization

Paper • 2505.16612 • Published May 22 • 6

New activity in gsarti/qe4pe 2 months ago

[bot] Conversion to Parquet

#1 opened 9 months ago by

parquet-converter

New activity in gsarti/qe4pe 4 months ago

Link paper to HF papers URL

#3 opened 4 months ago by

commented a paper 4 months ago

QE4PE: Word-level Quality Estimation for Human Post-Editing

Paper • 2503.03044 • Published Mar 4 • 6

commented a paper 5 months ago

We Can't Understand AI Using our Existing Vocabulary

Paper • 2502.07586 • Published Feb 11 • 10

commented a paper 6 months ago

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Paper • 2501.08319 • Published Jan 14 • 11

commented a paper 8 months ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Paper • 2411.14257 • Published Nov 21, 2024 • 13

New activity in gsarti/opus-mt-tc-en-pl 9 months ago

how to fine tune this model to get better polish translation

#3 opened about 2 years ago by

commented a paper 11 months ago

Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Paper • 2408.00113 • Published Jul 31, 2024 • 8