LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 12 days ago • 137
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing Paper • 2603.12254 • Published 29 days ago • 21
view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 Mar 10 • 122
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 170
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 307
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Paper • 2512.15603 • Published Dec 17, 2025 • 69
view article Article We’re open-sourcing our text-to-image model and the process behind it Nov 12, 2025 • 96
Common Diffusion Noise Schedules and Sample Steps are Flawed Paper • 2305.08891 • Published May 15, 2023 • 14
view article Article Exploring Environments Hub: Your Language Model needs better (open) environments to learn Sep 4, 2025 • 30
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 186
👁️ LFM2-VL Collection LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 10 items • Updated 2 days ago • 64