Article Interactive Tools for machine learning, deep learning, and math May 26, 2025 • 47
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 33 items • Updated 5 days ago • 19
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 649
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 26