BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding
Paper • 2503.21483
Generative AI, Natural Language Processing, Computer Vision
NearID: Identity Representation Learning via Near-identity Distractors
Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization