Commit dd49489 (verified) by nielsr (HF Staff) · Parent(s): dd2c629

Improve model card with metadata, links, and usage for Transition Models (TiM)


This PR significantly enhances the model card for the Transition Models (TiM) by:

- Adding the `license` to the metadata (Apache-2.0).
- Including the `pipeline_tag: text-to-image` to ensure discoverability on the Hugging Face Hub.
- Providing a concise description of the model based on the paper's abstract and highlights from the GitHub README.
- Including a "Quickstart" section with instructions for setting up the environment, downloading the model, and running text-to-image generation, directly extracted from the GitHub README to facilitate immediate usage.

Files changed (1): README.md (+58 -1)
README.md CHANGED
@@ -1 +1,58 @@
- arxiv.org/abs/2509.04394
+ ---
+ license: apache-2.0
+ pipeline_tag: text-to-image
+ ---
+
+ # Transition Models: Rethinking the Generative Learning Objective
+
+ This repository contains the Transition Models (TiM) presented in the paper [Transition Models: Rethinking the Generative Learning Objective](https://arxiv.org/abs/2509.04394).
+
+ TiM is a novel generative model designed for flexible photorealistic text-to-image generation. It achieves state-of-the-art performance with high efficiency by learning arbitrary state-to-state transitions, unifying few-step and many-step generation within a single model.
+
+ * **Paper**: [https://arxiv.org/abs/2509.04394](https://arxiv.org/abs/2509.04394)
+ * **Code**: [https://github.com/WZDTHU/TiM](https://github.com/WZDTHU/TiM)
+
+ ## Highlights
+
+ * Our Transition Models (TiM) are trained to master arbitrary state-to-state transitions. This approach allows TiM to learn the entire solution manifold of the generative process, unifying the few-step and many-step regimes within a single, powerful model.
+ ![Figure](https://github.com/WZDTHU/TiM/raw/main/assets/illustration.png)
+ * Despite having only 865M parameters, TiM achieves state-of-the-art performance, surpassing leading models such as SD3.5 (8B parameters) and FLUX.1 (12B parameters) across all evaluated step counts on the GenEval benchmark. Importantly, unlike previous few-step generators, TiM demonstrates monotonic quality improvement as the sampling budget increases.
+ ![Figure](https://github.com/WZDTHU/TiM/raw/main/assets/nfe_demo.png)
+ * Additionally, when employing our native-resolution strategy, TiM delivers exceptional fidelity at resolutions up to `4096x4096`.
+ ![Figure](https://github.com/WZDTHU/TiM/raw/main/assets/tim_demo.png)
+
+ ## Quickstart
+
+ ### 1. Setup
+
+ First, clone the repo:
+ ```bash
+ git clone https://github.com/WZDTHU/TiM.git && cd TiM
+ ```
+
+ #### 1.1 Environment Setup
+
+ ```bash
+ conda create -n tim_env python=3.10
+ conda activate tim_env
+ pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118
+ pip install flash-attn
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+ #### 1.2 Model Download
+
+ Download the Text-to-Image model:
+ ```bash
+ mkdir -p checkpoints
+ wget -c "https://huggingface.co/GoodEnough/TiM-T2I/resolve/main/t2i_model.bin" -O checkpoints/t2i_model.bin
+ ```
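An interrupted `wget` can leave a truncated checkpoint that fails only later, at load time, so a quick size sanity check before sampling can save a debugging round-trip. This is a generic helper, not part of the TiM repo, and the 100 MB threshold is an arbitrary illustration rather than the real file size:

```python
import os

def checkpoint_ok(path, min_bytes=100 * 1024 * 1024):
    """Return True if `path` exists and is at least `min_bytes` large.

    A truncated download typically shows up as a file far smaller than
    expected; the default threshold here is purely illustrative.
    """
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes

# Example: a missing or partial file fails the check.
print(checkpoint_ok("checkpoints/t2i_model.bin"))
```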
+
+ ### 2. Sampling (Text-to-Image Generation)
+
+ We provide sampling scripts for three benchmarks. You can specify the sampling steps, resolution, and CFG scale in the corresponding scripts.
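For readers unfamiliar with the CFG scale mentioned above: classifier-free guidance combines a conditional and an unconditional model prediction, pushing the output toward the prompt. A generic sketch of the standard formula (not TiM's actual sampling code; array values are toy stand-ins for model outputs):

```python
import numpy as np

def apply_cfg(cond_pred, uncond_pred, cfg_scale):
    """Classifier-free guidance: move the prediction away from the
    unconditional branch by a factor of cfg_scale.
    cfg_scale == 1.0 reduces to the plain conditional prediction."""
    return uncond_pred + cfg_scale * (cond_pred - uncond_pred)

cond = np.array([1.0, 2.0])    # toy conditional prediction
uncond = np.array([0.0, 0.0])  # toy unconditional prediction
print(apply_cfg(cond, uncond, 1.0))  # -> [1. 2.] (no guidance effect)
print(apply_cfg(cond, uncond, 3.0))  # -> [3. 6.] (stronger conditioning)
```

Higher scales follow the prompt more closely at some cost to diversity, which is why the scripts expose it as a tunable knob.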
+
+ Sampling with the TiM-T2I model on the GenEval benchmark:
+ ```bash
+ bash scripts/sample/t2i/sample_t2i_geneval.sh
+ ```