Trading Inference-Time Compute for Adversarial Robustness • Paper • 2501.18841 • Published Jan 31, 2025
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions • Paper • 2404.13208 • Published Apr 19, 2024