- Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
  Paper • 2403.06504 • Published • 56
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 153
- Slamming: Training a Speech Language Model on One GPU in a Day
  Paper • 2502.15814 • Published • 69
udif
AI & ML interests
None yet
Recent Activity
liked a model about 1 month ago: upstage/Solar-Open-100B
liked a model about 1 month ago: zai-org/GLM-4.7
liked a Space about 2 months ago: black-forest-labs/FLUX.2-dev
Organizations
None yet