Besmira Nushi

nushib

AI & ML interests

Machine Learning, Responsible AI

nushib's activity

authored 5 papers 11 days ago

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

Paper • 2304.03916 • Published Apr 8, 2023

Diversity of Thought Improves Reasoning Abilities of Large Language Models

Paper • 2310.07088 • Published Oct 11, 2023 • 5

Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models

Paper • 2404.06209 • Published Apr 9, 2024 • 5

Eureka: Evaluating and Understanding Large Foundation Models

Paper • 2409.10566 • Published Sep 13, 2024

BENCHAGENTS: Automated Benchmark Creation with Agent Interaction

Paper • 2410.22584 • Published Oct 29, 2024

liked a dataset 11 days ago

microsoft/Eureka-Bench-Logs

Updated 9 days ago • 1.84k • 6

New activity in microsoft/Eureka-Bench-Logs 11 days ago

Update readme - include phi tech report + contact info

#18 opened 11 days ago by nushib

New activity in microsoft/Eureka-Bench-Logs 21 days ago

Upload 7 files

#2 opened 21 days ago by nushib

authored a paper about 1 month ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30 • 48

authored a paper 2 months ago

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead

Paper • 2504.00294 • Published Mar 31 • 10

liked a model 5 months ago

microsoft/phi-4

Text Generation • Updated Feb 24 • 420k • 2.07k

liked a dataset 9 months ago

microsoft/VISION_LANGUAGE

Viewer • Updated Jan 23 • 30k • 77 • 5

upvoted a paper 11 months ago

The Art of Saying No: Contextual Noncompliance in Language Models

Paper • 2407.12043 • Published Jul 2, 2024 • 4

liked a model 12 months ago

microsoft/Orca-2-13b

Text Generation • Updated Nov 22, 2023 • 7.29k • 666

authored a paper about 1 year ago

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Paper • 2404.12241 • Published Apr 18, 2024 • 11

upvoted a paper over 1 year ago

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Paper • 2401.12168 • Published Jan 22, 2024 • 28

liked a dataset over 1 year ago

toxigen/toxigen-data

Viewer • Updated Jun 17, 2024 • 319k • 4.02k • 58

authored a paper over 1 year ago

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

Paper • 2310.15511 • Published Oct 24, 2023 • 5

updated a dataset over 1 year ago

microsoft/kitab

Viewer • Updated Oct 25, 2023 • 13.6k • 470 • 13

New activity in microsoft/kitab over 1 year ago

Update README.md

#5 opened over 1 year ago by marah-abdin