Inference-time Physics Alignment of Video Generative Models with Latent World Models • Paper • 2601.10553 • Published 4 days ago • 11
Transition Matching Distillation for Fast Video Generation • Paper • 2601.09881 • Published 4 days ago • 25
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices • Paper • 2601.08303 • Published 6 days ago • 14
DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving • Paper • 2601.01528 • Published 15 days ago • 18
Orient Anything V2: Unifying Orientation and Rotation Understanding • Paper • 2601.05573 • Published 10 days ago • 8
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals • Paper • 2601.05848 • Published 10 days ago • 14
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction • Paper • 2601.05966 • Published 9 days ago • 21
Guiding a Diffusion Transformer with the Internal Dynamics of Itself • Paper • 2512.24176 • Published 20 days ago • 7
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection • Paper • 2512.23273 • Published 21 days ago • 13
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation • Paper • 2512.23705 • Published 20 days ago • 44
Yume-1.5: A Text-Controlled Interactive World Generation Model • Paper • 2512.22096 • Published 23 days ago • 59
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning • Paper • 2512.20605 • Published 26 days ago • 60
Spatia: Video Generation with Updatable Spatial Memory • Paper • 2512.15716 • Published Dec 17, 2025 • 32
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming • Paper • 2512.21338 • Published 25 days ago • 21
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations • Paper • 2512.21004 • Published 26 days ago • 12
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times • Paper • 2512.16093 • Published Dec 18, 2025 • 93
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding • Paper • 2512.19693 • Published 27 days ago • 63