FLUX.2-Tiny-AutoEncoder is incompatible with FLUX.2 (Flux2Pipeline).
The diffusers version pinned in the config.json of the FLUX.2-Tiny-AutoEncoder VAE is 0.35.2, but Flux2Pipeline only exists as of 0.36.0. The tiny VAE's config is also missing several parameters that the pipeline expects, which leads to the errors below.
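A minimal sketch (pure stdlib; the 0.36.0 floor is taken from the observation above, and the comparison ignores pre-release tags) of checking whether an installed diffusers version is new enough:

```python
# Compare dotted version strings; a simplification that ignores pre-release tags.
def at_least(installed: str, floor: str = "0.36.0") -> bool:
    parse = lambda v: tuple(int(p) for p in v.split("."))
    return parse(installed) >= parse(floor)

print(at_least("0.35.2"))  # False — the version pinned in the tiny AE's config
print(at_least("0.36.0"))  # True  — the version that ships Flux2Pipeline
```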
Reproduction:

import torch
from diffusers import AutoModel, Flux2Pipeline

device = torch.device("cuda")
tiny_vae = AutoModel.from_pretrained(
    "fal/FLUX.2-Tiny-AutoEncoder", trust_remote_code=True, torch_dtype=torch.bfloat16
).to(device)
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev", vae=tiny_vae, torch_dtype=torch.bfloat16
).to(device)
Running this fails with:
[rank1]: self.vae_scale_factor = 2 ** (len(self.vae.config.block_out_channels) - 1) if getattr(self, "vae", None) else 8
[rank1]: AttributeError: 'FrozenDict' object has no attribute 'block_out_channels'
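This failure can be reproduced without loading any model: the tiny AE's config only defines encoder_block_out_channels / decoder_block_out_channels, while the pipeline reads block_out_channels. Below is a stand-in for diffusers' FrozenDict attribute access (the channel values are placeholders, not the real config):

```python
# Stand-in for diffusers' FrozenDict-style attribute access on a config dict.
class FrozenDict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(f"'FrozenDict' object has no attribute '{name}'")

# Placeholder channel lists — the real values live in the tiny AE's config.json.
tiny_cfg = FrozenDict(
    encoder_block_out_channels=[64, 128, 256, 512],
    decoder_block_out_channels=[512, 256, 128, 64],
)

try:
    # What Flux2Pipeline computes during __init__:
    scale = 2 ** (len(tiny_cfg.block_out_channels) - 1)
except AttributeError as e:
    print(e)  # 'FrozenDict' object has no attribute 'block_out_channels'
```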
After renaming decoder_block_out_channels and encoder_block_out_channels to block_out_channels in the configuration file, a different error is reported:

[rank1]: AttributeError: Could not access latents of provided encoder_output.
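For context, that error string is raised by diffusers' retrieve_latents helper, which only knows how to unpack encoder outputs exposing latent_dist or latents. A simplified stand-in (not the real implementation, which also handles sample/argmax modes) showing why an output that only exposes a `latent` attribute, as the tiny AE's apparently does, falls through to the error:

```python
from dataclasses import dataclass

# Simplified stand-in for diffusers' retrieve_latents dispatch.
def retrieve_latents(encoder_output):
    if hasattr(encoder_output, "latent_dist"):
        return encoder_output.latent_dist.sample()
    if hasattr(encoder_output, "latents"):
        return encoder_output.latents
    raise AttributeError("Could not access latents of provided encoder_output")

@dataclass
class TinyEncoderOutput:
    latent: object  # note: `latent`, not `latents` — so the lookup above fails

try:
    retrieve_latents(TinyEncoderOutput(latent=None))
except AttributeError as e:
    print(e)  # Could not access latents of provided encoder_output
```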
The encode call in the pipeline was then changed to read the tiny AE's output directly:

    image_latents = self.vae.encode(image).latent
    image_latents = self._patchify_latents(image_latents)
Error reported after this modification:

[rank1]: latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
[rank1]: RuntimeError: shape '[1, 128, 37, 2, 27, 2]' is invalid for input of size 520960
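A sanity check on the numbers in that traceback (assuming, per the error string, batch 1 and 128 latent channels): the requested 2x2 patchify view needs 511488 elements, but the tensor has 520960 = 1 x 128 x 74 x 55. If that decomposition is right, the tiny AE produces a 74 x 55 latent grid where the pipeline expects 74 x 54, and the odd width cannot be split into 2x2 patches at all — i.e. the tiny AE's downsampling/padding disagrees with what Flux2Pipeline derives from the config:

```python
import math

# Numbers copied from the traceback; the batch/channel split is an assumption.
numel = 520960                       # actual element count of the tiny AE latents
target = (1, 128, 37, 2, 27, 2)      # (B, C, H//2, 2, W//2, 2) requested by view()
print(math.prod(target))             # 511488 != 520960, hence the RuntimeError
spatial = numel // (1 * 128)         # factor out assumed batch and channels
print(spatial)                       # 4070 = 74 * 55 → latent grid is 74 x 55
```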