RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published Mar 2 • 63
Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation Paper • 2604.02368 • Published 21 days ago • 11
You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass Paper • 2604.10966 • Published 4 days ago • 8