FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Paper • 2507.01953 • Published Jul 2 • 18
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory Paper • 2507.01945 • Published Jul 2 • 76
T-LoRA: Single Image Diffusion Model Customization Without Overfitting Paper • 2507.05964 • Published Jul 8 • 119
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS Paper • 2507.07136 • Published Jul 9 • 38
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining Paper • 2507.14119 • Published Jul 18 • 58
DesignLab: Designing Slides Through Iterative Detection and Correction Paper • 2507.17202 • Published Jul 23 • 50
PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation Paper • 2507.16116 • Published Jul 22 • 11
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts Paper • 2507.20939 • Published Jul 28 • 56
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again Paper • 2507.22058 • Published Jul 29 • 39
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation Paper • 2508.07981 • Published Aug 11 • 58
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14 • 144
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper • 2508.20751 • Published Aug 28 • 89
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation Paper • 2510.26213 • Published Oct 30 • 9
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks Paper • 2510.25760 • Published Oct 29 • 16
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published Nov 13 • 122
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 28 days ago • 224
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published 30 days ago • 66
Light-X: Generative 4D Video Rendering with Camera and Illumination Control Paper • 2512.05115 • Published 13 days ago • 10
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 8 days ago • 123
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality Paper • 2512.07951 • Published 9 days ago • 46
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published 12 days ago • 28
Towards Scalable Pre-training of Visual Tokenizers for Generation Paper • 2512.13687 • Published 1 day ago • 75
Few-Step Distillation for Text-to-Image Generation: A Practical Guide Paper • 2512.13006 • Published 2 days ago • 3