Collections

Discover the best community collections!

Collections including paper arxiv:2604.08995

Video understanding

about 10 hours ago

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26, 2024 • 32
Mixture of Nested Experts: Adaptive Processing of Visual Tokens

Paper • 2407.19985 • Published Jul 29, 2024 • 37
TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12, 2025 • 45
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO

Paper • 2506.07464 • Published Jun 9, 2025 • 14

about 7 hours ago

WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG

Paper • 2603.23497 • Published 28 days ago • 91
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Paper • 2604.07209 • Published 14 days ago • 35
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

Paper • 2604.08995 • Published 12 days ago • 46
MultiWorld: Scalable Multi-Agent Multi-View Video World Models

Paper • 2604.18564 • Published 1 day ago • 35

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models

Paper • 2410.09342 • Published Oct 12, 2024 • 39
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 55

Non-LLM nice things

Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory

Paper • 2604.08995 • Published 12 days ago • 46

Stuff I'm going to read

about 2 hours ago

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 176
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

Paper • 2601.07832 • Published Jan 12 • 52
Motion Attribution for Video Generation

Paper • 2601.08828 • Published Jan 13 • 72
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 27
