Damien Sileo

sileod

36 18 104

https://sileod.github.io/

AI & ML interests

NLP datasets, reasoning, scaling multi-task learning

Recent Activity

updated a dataset about 11 hours ago

reasoning-core/staging

updated a dataset about 21 hours ago

reasoning-core/synlogic

updated a collection about 21 hours ago

datasets


Organizations

upvoted a paper 4 months ago

Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models

Paper • 2603.29497 • Published Mar 31 • 6

upvoted 3 papers 5 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 13

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Paper • 2602.20743 • Published Feb 24 • 2

Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training

Paper • 2603.02208 • Published Mar 2 • 4

upvoted a collection 5 months ago

datasets

Collection

Reasoning Core ◉ Pre-generated symbolic reasoning data, from pre-training pile to post-training environments • 7 items • Updated about 21 hours ago • 4

upvoted a paper 6 months ago

MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts

Paper • 2601.18790 • Published Jan 26 • 2

upvoted 2 papers 10 months ago

Reasoning Core: A Scalable RL Environment for LLM Symbolic Reasoning

Paper • 2509.18083 • Published Sep 22, 2025 • 5

Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem

Paper • 2509.06809 • Published Sep 8, 2025 • 3

upvoted 3 papers almost 2 years ago

TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods

Paper • 2407.21630 • Published Jul 31, 2024 • 8

Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper • 2407.14933 • Published Jul 20, 2024 • 15

Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Paper • 2407.13481 • Published Jul 18, 2024 • 10

upvoted 3 papers about 2 years ago

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI

Paper • 2310.16787 • Published Oct 25, 2023 • 5

Probing neural language models for understanding of words of estimative probability

Paper • 2211.03358 • Published Nov 7, 2022 • 1

Scaling Synthetic Logical Reasoning Datasets with Context-Sensitive Declarative Grammars

Paper • 2406.11035 • Published Jun 16, 2024 • 1

upvoted 3 collections over 2 years ago

upvoted a paper almost 3 years ago

tasksource: Structured Dataset Preprocessing Annotations for Frictionless Extreme Multi-Task Learning and Evaluation

Paper • 2301.05948 • Published Jan 14, 2023 • 3
