# EARL - SFT (S) (8B)

- **Model Name:** mair-lab/sft-simple
- **Model Size:** 8B parameters
- **Base Model:** BAAI/Emu3-Stage1
- **Training Method:** Supervised Fine-Tuning (SFT)
- **Dataset:** Simple Edit (S)
This model is part of the EARL benchmark effort introduced in our paper:
**EARL: The Promise of RL for Autoregressive Image Editing**
## Model Summary
This SFT model is fine-tuned from Emu3 using direct supervision on the Simple Edit dataset. It is optimized for general-purpose autoregressive image editing without requiring intermediate reasoning steps. It performs competitively with open-source supervised baselines across editing benchmarks, with its strongest result on OmniEdit (see the table below).
Inference script and usage: GitHub Repo
## Benchmark Results (Avg Score Across Benchmarks)

| Model | Base Model | OmniEdit | EmuEdit | AURORA | MB | VisMin | I2EBench | AVG |
|---|---|---|---|---|---|---|---|---|
| Magicbrush | SD v1.5 | 3.43 | 3.28 | 3.01 | 3.64 | 3.48 | 3.06 | 3.32 |
| InstructPix2Pix | SD v1.5 | 3.97 | 3.24 | 3.05 | 3.12 | 2.94 | 3.23 | 3.26 |
| Aurora | SD v1.5 | 4.50 | 4.40 | 4.12 | 4.62 | 3.82 | 3.58 | 4.17 |
| Omnigen* | - | 5.68 | 5.00 | 4.10 | 4.68 | 4.09 | 4.68 | 4.70 |
| SFT (S) | Emu3 | 5.73 | 3.66 | 3.58 | 3.19 | 3.57 | 3.59 | 3.88 |
> **Note:** The Emu3-based SFT (S) model achieves top results among all open-source supervised models on OmniEdit and competitive performance across other benchmarks.
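The AVG column appears to be the simple arithmetic mean of the six per-benchmark scores. A minimal sketch verifying this against the Aurora row (the dictionary keys mirror the table's column names; small discrepancies in other rows can arise from rounding the per-benchmark scores before averaging):

```python
# Per-benchmark scores for the Aurora row, copied from the table above.
aurora_scores = {
    "OmniEdit": 4.50, "EmuEdit": 4.40, "AURORA": 4.12,
    "MB": 4.62, "VisMin": 3.82, "I2EBench": 3.58,
}

# AVG = unweighted mean across the six benchmarks (assumption).
avg = sum(aurora_scores.values()) / len(aurora_scores)
print(round(avg, 2))  # 4.17, matching the AVG column for Aurora
```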
## Use Cases
- Open-ended and instruction-guided image editing
- Object, attribute, style, and environment changes