Self-Improvement in Multimodal Large Language Models: A Survey Paper • 2510.02665 • Published Oct 3 • 20
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29 • 140
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3 • 32
JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization Paper • 2503.23377 • Published Mar 30 • 57
Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT Paper • 2502.06782 • Published Feb 10 • 14