Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 28 days ago • 167
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published May 29, 2025 • 93
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published May 29, 2025 • 53
Automated Movie Generation via Multi-Agent CoT Planning Paper • 2503.07314 • Published Mar 10, 2025 • 44
Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning Paper • 2503.07002 • Published Mar 10, 2025 • 39
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs Paper • 2408.12060 • Published Aug 22, 2024 • 6
Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation Paper • 2408.09787 • Published Aug 19, 2024 • 10
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs Paper • 2408.11813 • Published Aug 21, 2024 • 12
SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models Paper • 2408.12114 • Published Aug 22, 2024 • 14
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22, 2024 • 27
DreamCinema: Cinematic Transfer with Free Camera and 3D Character Paper • 2408.12601 • Published Aug 22, 2024 • 31
RoundTable: Leveraging Dynamic Schema and Contextual Autocomplete for Enhanced Query Precision in Tabular Question Answering Paper • 2408.12369 • Published Aug 22, 2024 • 5
T3M: Text Guided 3D Human Motion Synthesis from Speech Paper • 2408.12885 • Published Aug 23, 2024 • 13
Efficient Detection of Toxic Prompts in Large Language Models Paper • 2408.11727 • Published Aug 21, 2024 • 13
NanoFlow: Towards Optimal Large Language Model Serving Throughput Paper • 2408.12757 • Published Aug 22, 2024 • 19
Training-free Long Video Generation with Chain of Diffusion Model Experts Paper • 2408.13423 • Published Aug 24, 2024 • 23