- SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
  Paper • 2407.15841 • Published • 40
- Stable Audio Open
  Paper • 2407.14358 • Published • 26
- PlacidDreamer: Advancing Harmony in Text-to-3D Generation
  Paper • 2407.13976 • Published • 5
- Efficient Audio Captioning with Encoder-Level Knowledge Distillation
  Paper • 2407.14329 • Published • 5