Commit 5a535e7 (verified) by strangerTHU
1 Parent(s): ae55404

Upload folder using huggingface_hub

Files changed (4)
  1. .DS_Store +0 -0
  2. README.md +44 -0
  3. assets/.DS_Store +0 -0
  4. assets/combined.gif +2 -2
.DS_Store ADDED
Binary file (6.15 kB).
 
README.md CHANGED
@@ -0,0 +1,44 @@
+ # DC-AE-Lite
+ \[[github](https://github.com/dc-ai-projects/DC-Gen/tree/main)\]
+
+ Decoding is often the speed bottleneck in few-step latent diffusion models. We release DC-AE-Lite, which shares the encoder of DC-AE-f32c32-SANA-1.0 but has a much smaller decoder. It can be applied, without any retraining, to diffusion models trained with DC-AE-f32c32-SANA-1.0.
+
+ ## Demo
+ <p align="center">
+ <img src="./assets/combined.gif"><br>
+ <b> DC-AE-Lite vs DC-AE reconstruction visual quality </b>
+ </p>
+
+ <p align="center">
+ <img src="./assets/dc-ae-lite.jpg"><br>
+ <b> DC-AE-Lite achieves 1.8× faster decoding than DC-AE with similar reconstruction quality </b>
+ </p>
+
+
+
+ ## Usage
+ ```python
+ from diffusers import AutoencoderDC
+ from PIL import Image
+ import torch
+ import torchvision.transforms as transforms
+ from torchvision.utils import save_image
+
+ device = torch.device("cuda")
+ dc_ae_lite = AutoencoderDC.from_pretrained("dc-ai/dc-ae-lite-f32c32-diffusers").to(device).eval()
+
+ # Crop to 1024x1024 and scale pixel values to [-1, 1].
+ transform = transforms.Compose([
+     transforms.CenterCrop((1024, 1024)),
+     transforms.ToTensor(),
+     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
+ ])
+
+ # Convert to RGB so the 3-channel normalization also works for RGBA or grayscale files.
+ image = Image.open("assets/fig/girl.png").convert("RGB")
+
+ # Inference only, so skip gradient tracking.
+ with torch.no_grad():
+     x = transform(image)[None].to(device)  # add a batch dimension
+     latent = dc_ae_lite.encode(x).latent
+     print(f"latent shape: {latent.shape}")
+
+     y = dc_ae_lite.decode(latent).sample
+
+ # Map the output from [-1, 1] back to [0, 1] before saving.
+ save_image(y * 0.5 + 0.5, "demo_dc_ae_lite.png")
+ ```
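
The `f32c32` suffix in the model name is conventionally read as a 32× spatial downsampling factor and 32 latent channels, so the latent shape printed by the usage snippet can be predicted ahead of time. The following is a minimal sketch of that arithmetic, assuming this reading of the name. `latent_shape` is a hypothetical helper written for illustration and is not part of diffusers.

```python
def latent_shape(batch, height, width, spatial_factor=32, latent_channels=32):
    """Return the (N, C, H, W) latent shape expected from an f32c32 autoencoder.

    Assumption: "f32" means each spatial side shrinks by 32 and "c32" means
    the latent has 32 channels.
    """
    if height % spatial_factor or width % spatial_factor:
        raise ValueError("height and width must be multiples of the spatial factor")
    return (batch, latent_channels, height // spatial_factor, width // spatial_factor)


# A single 1024x1024 image, as in the usage snippet.
print(latent_shape(1, 1024, 1024))  # (1, 32, 32, 32)
```

Under this assumption, a 1024×1024 RGB input (3 × 1024 × 1024 values) maps to a 32 × 32 × 32 latent, which is the tensor the decoder has to expand back to full resolution.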
assets/.DS_Store ADDED
Binary file (6.15 kB).
 
assets/combined.gif CHANGED

Git LFS Details

  • SHA256: 4a273c7badd75db9607ae20b4db7fe42c1cbfe98714edac1881ff4c164f2c7e4
  • Pointer size: 132 Bytes
  • Size of remote file: 2.05 MB

Git LFS Details

  • SHA256: 5fb8599d10126bdf86e72bb0c6a87528435792cbf98de0c579da1911fcb105f0
  • Pointer size: 132 Bytes
  • Size of remote file: 2.52 MB