Update README.md

Files changed (4)

.gitattributes +2 -0
README.md +86 -0
gmdit.png +3 -0
gmdit_results.png +3 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+gmdit_results.png filter=lfs diff=lfs merge=lfs -text
+gmdit.png filter=lfs diff=lfs merge=lfs -text
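
Each added line in the hunk above routes the matching file through Git LFS. A minimal sketch of how such patterns select files, assuming simple shell-style globs: Python's `fnmatch` only approximates gitattributes matching, and `lfs_tracked` is a hypothetical helper, not part of Git or the GMFlow repository.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "gmdit_results.png", "gmdit.png"]


def lfs_tracked(filename: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if a bare filename matches any pattern (glob approximation)."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# The two new images match; the README stays a regular Git blob.
for name in ["gmdit.png", "gmdit_results.png", "README.md"]:
    print(name, lfs_tracked(name))
```

The images are listed by exact name rather than as `*.png`, so any other PNG added later would not be picked up by these rules.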

README.md CHANGED

@@ -1,3 +1,89 @@
 ---
 license: apache-2.0
 ---
+# Gaussian Mixture Flow Matching Models (GMFlow)
+Model used in the paper:
+**Gaussian Mixture Flow Matching Models**
+<br>
+[Hansheng Chen](https://lakonik.github.io/)<sup>1</sup>,
+[Kai Zhang](https://kai-46.github.io/website/)<sup>2</sup>,
+[Hao Tan](https://research.adobe.com/person/hao-tan/)<sup>2</sup>,
+[Zexiang Xu](https://zexiangxu.github.io/)<sup>3</sup>,
+[Fujun Luan](https://research.adobe.com/person/fujun/)<sup>2</sup>,
+[Leonidas Guibas](https://geometry.stanford.edu/?member=guibas)<sup>1</sup>,
+[Gordon Wetzstein](http://web.stanford.edu/~gordonwz/)<sup>1</sup>,
+[Sai Bi](https://sai-bi.github.io/)<sup>2</sup><br>
+<sup>1</sup>Stanford University, <sup>2</sup>Adobe Research, <sup>3</sup>Hillbot
+<br>
+[[arXiv](https://arxiv.org/abs/2504.05304)] [[GitHub](https://github.com/Lakonik/GMFlow)]
+<img src="gmdit.png" width="600"  alt=""/>
+<img src="gmdit_results.png" width="1000"  alt=""/>
+## Usage
+Install the [official code repository](https://github.com/Lakonik/GMFlow) first.
+We provide a [Diffusers pipeline](lib/pipelines/gmdit_pipeline.py) for easy inference. The following code samples images from the pretrained GM-DiT model, first with the GM-ODE 2 solver and then with the GM-SDE 2 solver.
+```python
+import torch
+from huggingface_hub import snapshot_download
+from lib.models.diffusions.schedulers import FlowEulerODEScheduler, GMFlowSDEScheduler
+from lib.pipelines.gmdit_pipeline import GMDiTPipeline
+# The pipeline currently loads only local checkpoints, so download the checkpoint first
+ckpt = snapshot_download(repo_id='Lakonik/gmflow_imagenet_k8_ema')
+pipe = GMDiTPipeline.from_pretrained(ckpt, variant='bf16', torch_dtype=torch.bfloat16)
+pipe = pipe.to('cuda')
+# Pick words that exist in ImageNet
+words = ['jay', 'magpie']
+class_ids = pipe.get_label_ids(words)
+# Sample using GM-ODE 2 solver
+pipe.scheduler = FlowEulerODEScheduler.from_config(pipe.scheduler.config)
+generator = torch.manual_seed(42)
+output = pipe(
+    class_labels=class_ids,
+    guidance_scale=0.45,
+    num_inference_steps=32,
+    num_inference_substeps=4,
+    output_mode='mean',
+    order=2,
+    generator=generator)
+for i, (word, image) in enumerate(zip(words, output.images)):
+    image.save(f'{i:03d}_{word}_gmode2_step32.png')
+# Sample using GM-SDE 2 solver (the first run may be slow due to CUDA compilation)
+pipe.scheduler = GMFlowSDEScheduler.from_config(pipe.scheduler.config)
+generator = torch.manual_seed(42)
+output = pipe(
+    class_labels=class_ids,
+    guidance_scale=0.45,
+    num_inference_steps=32,
+    num_inference_substeps=1,
+    output_mode='sample',
+    order=2,
+    generator=generator)
+for i, (word, image) in enumerate(zip(words, output.images)):
+    image.save(f'{i:03d}_{word}_gmsde2_step32.png')
+```
+## Citation
+```
+@misc{gmflow,
+      title={Gaussian Mixture Flow Matching Models},
+      author={Hansheng Chen and Kai Zhang and Hao Tan and Zexiang Xu and Fujun Luan and Leonidas Guibas and Gordon Wetzstein and Sai Bi},
+      year={2025},
+      eprint={2504.05304},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2504.05304},
+}
+```

gmdit.png ADDED

Git LFS Details

SHA256: 753b11ed149867107eec9f97a07389c16c554c865780428bb074b0a3e61e7688
Pointer size: 131 Bytes
Size of remote file: 185 kB

gmdit_results.png ADDED

Git LFS Details

SHA256: 78d4d9939ceccf0d07936e4202a06abc16761a4e69a7b024cdfe751d10330667
Pointer size: 132 Bytes
Size of remote file: 1.26 MB
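
The pointer sizes reported above follow from the Git LFS pointer format: a three-line text file holding the spec version, the SHA256 object ID, and the byte size of the real file. A minimal sketch that rebuilds such pointers, assuming hypothetical exact sizes of 185000 and 1260000 bytes, since the page shows only rounded values; `lfs_pointer` is an illustrative helper, not a Git LFS API.

```python
def lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1 layout)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


# SHA256 values are copied from the details above; the sizes are assumed.
gmdit = lfs_pointer(
    "753b11ed149867107eec9f97a07389c16c554c865780428bb074b0a3e61e7688", 185_000
)
results = lfs_pointer(
    "78d4d9939ceccf0d07936e4202a06abc16761a4e69a7b024cdfe751d10330667", 1_260_000
)

print(len(gmdit.encode("ascii")))    # 131
print(len(results.encode("ascii")))  # 132
```

The one-byte difference between the two pointers comes only from the extra digit in the `size` line, which matches the 131-byte and 132-byte figures shown.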