Diffusers
GMDiTPipeline
Lakonik committed
Commit 40e3822 · verified
1 Parent(s): 452c6f7

Update README.md

Files changed (4)
  1. .gitattributes +2 -0
  2. README.md +86 -0
  3. gmdit.png +3 -0
  4. gmdit_results.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ gmdit_results.png filter=lfs diff=lfs merge=lfs -text
+ gmdit.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,89 @@
  ---
  license: apache-2.0
  ---
+
+ # Gaussian Mixture Flow Matching Models (GMFlow)
+
+ Model used in the paper:
+
+ **Gaussian Mixture Flow Matching Models**
+ <br>
+ [Hansheng Chen](https://lakonik.github.io/)<sup>1</sup>,
+ [Kai Zhang](https://kai-46.github.io/website/)<sup>2</sup>,
+ [Hao Tan](https://research.adobe.com/person/hao-tan/)<sup>2</sup>,
+ [Zexiang Xu](https://zexiangxu.github.io/)<sup>3</sup>,
+ [Fujun Luan](https://research.adobe.com/person/fujun/)<sup>2</sup>,
+ [Leonidas Guibas](https://geometry.stanford.edu/?member=guibas)<sup>1</sup>,
+ [Gordon Wetzstein](http://web.stanford.edu/~gordonwz/)<sup>1</sup>,
+ [Sai Bi](https://sai-bi.github.io/)<sup>2</sup><br>
+ <sup>1</sup>Stanford University, <sup>2</sup>Adobe Research, <sup>3</sup>Hillbot
+ <br>
+
+ [[arXiv](https://arxiv.org/abs/2504.05304)] [[GitHub](https://github.com/Lakonik/GMFlow)]
+
+ <img src="gmdit.png" width="600" alt=""/>
+
+ <img src="gmdit_results.png" width="1000" alt=""/>
+
+ ## Usage
+
+ Please first install the [official code repository](https://github.com/Lakonik/GMFlow).
+
+ We provide a [Diffusers pipeline](lib/pipelines/gmdit_pipeline.py) for easy inference. The following code demonstrates how to sample images from the pretrained GM-DiT model using the GM-ODE 2 solver and the GM-SDE 2 solver.
+
+ ```python
+ import torch
+ from huggingface_hub import snapshot_download
+ from lib.models.diffusions.schedulers import FlowEulerODEScheduler, GMFlowSDEScheduler
+ from lib.pipelines.gmdit_pipeline import GMDiTPipeline
+
+ # Currently the pipeline can only load local checkpoints, so we need to download the checkpoint first
+ ckpt = snapshot_download(repo_id='Lakonik/gmflow_imagenet_k8_ema')
+ pipe = GMDiTPipeline.from_pretrained(ckpt, variant='bf16', torch_dtype=torch.bfloat16)
+ pipe = pipe.to('cuda')
+
+ # Pick words that exist in ImageNet
+ words = ['jay', 'magpie']
+ class_ids = pipe.get_label_ids(words)
+
+ # Sample using GM-ODE 2 solver
+ pipe.scheduler = FlowEulerODEScheduler.from_config(pipe.scheduler.config)
+ generator = torch.manual_seed(42)
+ output = pipe(
+     class_labels=class_ids,
+     guidance_scale=0.45,
+     num_inference_steps=32,
+     num_inference_substeps=4,
+     output_mode='mean',
+     order=2,
+     generator=generator)
+ for i, (word, image) in enumerate(zip(words, output.images)):
+     image.save(f'{i:03d}_{word}_gmode2_step32.png')
+
+ # Sample using GM-SDE 2 solver (the first run may be slow due to CUDA compilation)
+ pipe.scheduler = GMFlowSDEScheduler.from_config(pipe.scheduler.config)
+ generator = torch.manual_seed(42)
+ output = pipe(
+     class_labels=class_ids,
+     guidance_scale=0.45,
+     num_inference_steps=32,
+     num_inference_substeps=1,
+     output_mode='sample',
+     order=2,
+     generator=generator)
+ for i, (word, image) in enumerate(zip(words, output.images)):
+     image.save(f'{i:03d}_{word}_gmsde2_step32.png')
+ ```
+
+ ## Citation
+ ```
+ @misc{gmflow,
+     title={Gaussian Mixture Flow Matching Models},
+     author={Hansheng Chen and Kai Zhang and Hao Tan and Zexiang Xu and Fujun Luan and Leonidas Guibas and Gordon Wetzstein and Sai Bi},
+     year={2025},
+     eprint={2504.05304},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG},
+     url={https://arxiv.org/abs/2504.05304},
+ }
+ ```
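
The two sampling calls in the README's usage snippet share `guidance_scale`, `num_inference_steps`, and `order`, and differ only in the scheduler, `num_inference_substeps`, and `output_mode`. A minimal sketch of how those settings could be kept in one place is given here. The helper names `build_call_kwargs` and `output_name` and the solver keys `'gmode2'` / `'gmsde2'` are hypothetical and not part of the GMFlow code; the values are copied from the snippet.

```python
# Settings shared by both sampling calls in the README snippet.
COMMON_KWARGS = dict(guidance_scale=0.45, num_inference_steps=32, order=2)

# Per-solver settings, copied from the two calls in the README snippet.
SOLVER_KWARGS = {
    'gmode2': dict(num_inference_substeps=4, output_mode='mean'),
    'gmsde2': dict(num_inference_substeps=1, output_mode='sample'),
}


def build_call_kwargs(solver):
    """Merge shared and solver-specific keyword arguments (hypothetical helper)."""
    return {**COMMON_KWARGS, **SOLVER_KWARGS[solver]}


def output_name(index, word, solver, steps):
    """Reproduce the file naming scheme used in the README (hypothetical helper)."""
    return f'{index:03d}_{word}_{solver}_step{steps}.png'


words = ['jay', 'magpie']
names = [output_name(i, w, 'gmode2', 32) for i, w in enumerate(words)]
# names == ['000_jay_gmode2_step32.png', '001_magpie_gmode2_step32.png']
```

With the pipeline loaded as in the README, a call would then look like `pipe(class_labels=class_ids, generator=generator, **build_call_kwargs('gmsde2'))` after assigning the matching scheduler.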
gmdit.png ADDED

Git LFS Details

  • SHA256: 753b11ed149867107eec9f97a07389c16c554c865780428bb074b0a3e61e7688
  • Pointer size: 131 Bytes
  • Size of remote file: 185 kB
gmdit_results.png ADDED

Git LFS Details

  • SHA256: 78d4d9939ceccf0d07936e4202a06abc16761a4e69a7b024cdfe751d10330667
  • Pointer size: 132 Bytes
  • Size of remote file: 1.26 MB
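
The "Pointer size" values above follow from the Git LFS pointer format: the repository stores a three-line text file (`version`, `oid`, `size`) in place of each image. The sketch below rebuilds such a pointer for `gmdit.png` and measures its length. The exact byte count of the image is not shown on this page, so `ASSUMED_SIZE` is a hypothetical placeholder; any six-digit size gives the same pointer length.

```python
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(oid_sha256: str, size_bytes: int) -> str:
    """Return the text of a Git LFS pointer file for one object."""
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid_sha256}\n"
        f"size {size_bytes}\n"
    )


# SHA256 copied from the "Git LFS Details" of gmdit.png.
GMDIT_OID = "753b11ed149867107eec9f97a07389c16c554c865780428bb074b0a3e61e7688"
# Hypothetical byte count: the page only reports "185 kB", not the exact size.
ASSUMED_SIZE = 185_000

pointer = lfs_pointer(GMDIT_OID, ASSUMED_SIZE)
pointer_size = len(pointer.encode("utf-8"))  # 131 for any six-digit size
```

A seven-digit size, as implied by the 1.26 MB of `gmdit_results.png`, adds one character, which matches the 132-byte pointer reported for that file.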