Evaluating Gemini Robotics Policies in a Veo World Simulator Paper • 2512.10675 • Published 15 days ago • 16
Video OWL-ViT: Temporally-consistent open-world localization in video Paper • 2308.11093 • Published Aug 22, 2023
Scaling Vision Transformers to 22 Billion Parameters Paper • 2302.05442 • Published Feb 10, 2023 • 2
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames Paper • 2302.04973 • Published Feb 9, 2023 • 1
Simple Open-Vocabulary Object Detection with Vision Transformers Paper • 2205.06230 • Published May 12, 2022 • 3
AudioSlots: A slot-centric generative model for audio separation Paper • 2305.05591 • Published May 9, 2023 • 3
DORSal: Diffusion for Object-centric Representations of Scenes et al. Paper • 2306.08068 • Published Jun 13, 2023 • 6