OmniDance: Multimodal Driven Dance Video Generation with Large-scale Internet Data Paper • 2606.30019 • Published 6 days ago • 18
BlockPilot: Instance-Adaptive Policy Learning for Diffusion-based Speculative Decoding Paper • 2606.31315 • Published 5 days ago • 73
DreamX-World 1.0: A General-Purpose Interactive World Model Paper • 2606.16993 • Published 20 days ago • 113
Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution Paper • 2606.10917 • Published 26 days ago • 76
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published May 21 • 179
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published May 18 • 93
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics Paper • 2604.17295 • Published Apr 19 • 84
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation Paper • 2604.18168 • Published Apr 20 • 96
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published Apr 17 • 73
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Paper • 2603.22212 • Published Mar 23 • 127
Video-CoE: Reinforcing Video Event Prediction via Chain of Events Paper • 2603.14935 • Published Mar 16 • 91
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 145
From Scale to Speed: Adaptive Test-Time Scaling for Image Editing Paper • 2603.00141 • Published Feb 24 • 138
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 201
Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models Paper • 2601.20354 • Published Jan 28 • 111
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published Jan 28 • 119
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published Jan 15 • 155
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 171
Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training Paper • 2510.12586 • Published Oct 14, 2025 • 115