TBD-VLA: Temporal Block Diffusion Vision Language Action Model Paper • 2606.07895 • Published 10 days ago • 2
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 12 days ago • 117
From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain Paper • 2605.23895 • Published 24 days ago • 52
WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction Paper • 2605.29341 • Published 18 days ago • 18
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation Paper • 2605.25378 • Published 21 days ago • 61
GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation Paper • 2605.21605 • Published 26 days ago • 14
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation Paper • 2605.23271 • Published 24 days ago • 80
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published 25 days ago • 177
Multi-Objective and Mixed-Reward Reinforcement Learning via Reward-Decorrelated Policy Optimization Paper • 2605.13641 • Published May 13 • 50
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published May 13 • 271
T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning Paper • 2605.02178 • Published May 4 • 10
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 243
Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models Paper • 2604.10949 • Published Apr 13 • 40
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 327
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published Apr 5 • 51
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 633