AI Safety Research

AISafety · https://humanaligned.ai

AI & ML interests

LLMs, planning, EA

Recent Activity

liked a dataset 25 days ago

LightningRodLabs/future-as-label-paper-training-dataset

liked a model 27 days ago

cyankiwi/GLM-4.7-Flash-AWQ-4bit

liked a model 27 days ago

zai-org/GLM-4.7-Flash

upvoted 2 articles about 2 months ago

Exploring Environments Hub: Your Language Model needs better (open) environments to learn

Article • Sep 4, 2025 • 29

HUMAINE: A Rigorous Framework for Understanding AI Through Human Experience

Article • Sep 16, 2025 • 7

upvoted an article 2 months ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • Aug 9, 2025 • 84

upvoted a collection 2 months ago

Olmo 3

Artifacts for the Olmo 3 release. • 9 items • Updated Dec 23, 2025 • 163

upvoted an article 2 months ago

We Got Claude to Fine-Tune an Open Source LLM

Article • Dec 4, 2025 • 594

upvoted a collection 3 months ago

Transformers.js demos

A collection of my favorite WebML demos, built with Transformers.js! • 30 items • Updated Jul 11, 2024 • 138

upvoted 2 papers 3 months ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 90

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

upvoted a collection 3 months ago

The Bestiary

Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated Nov 16, 2025 • 90

upvoted an article 4 months ago

EuroLLM-9B

Article • Dec 2, 2024 • 139

upvoted a collection 4 months ago

🎯 Liquid Nanos

Library of task-specific models: https://www.liquid.ai/blog/introducing-liquid-nanos-frontier-grade-performance-on-everyday-devices • 26 items • Updated 15 days ago • 109

upvoted an article 4 months ago

SOTA OCR with Core ML and dots.ocr

Article • Oct 2, 2025 • 62

upvoted a collection 5 months ago

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 525

upvoted a paper 5 months ago

Reinforcement Learning on Pre-Training Data

Paper • 2509.19249 • Published Sep 23, 2025 • 67

upvoted 2 collections 6 months ago

InternVL3.5

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28, 2025 • 105

DeepSeek-V3.1

4 items • Updated Nov 27, 2025 • 260

upvoted an article 6 months ago

Introducing AI Sheets: a tool to work with datasets using open AI models!

Article • Aug 8, 2025 • 108

upvoted a paper 6 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 206

upvoted 2 collections 7 months ago

GLM-4.5

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11, 2025 • 252

Sparse Autoencoders

SAEs are tools for understanding the internal representations of neural networks. These can be loaded using https://github.com/EleutherAI/sae • 9 items • Updated Feb 26, 2025 • 7