Kimi K2.5 — Opus Magnum LoRA (Tinker-format, step 45, prodops)

LoRA adapter for moonshotai/Kimi-K2.5, trained via GRPO RL on Opus Magnum puzzles using the Tinker cookbook.

Format note

This adapter is in Tinker-native format: its keys use Tinker's internal naming convention, not the standard PEFT layout. It is NOT directly loadable by vLLM, SGLang, or PEFT.
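One quick way to tell which layout an adapter uses is to look at its tensor names. The sketch below assumes the usual PEFT convention, in which names begin with `base_model.model.` and contain `lora_A` or `lora_B`. The function name `looks_like_peft` and the sample keys are hypothetical, and Tinker's own key names are not reproduced here.

```python
from typing import Iterable

# Naming pattern commonly seen in PEFT LoRA checkpoints (assumed here).
PEFT_PREFIX = "base_model.model."
PEFT_MARKERS = ("lora_A", "lora_B")


def looks_like_peft(keys: Iterable[str]) -> bool:
    """Return True if every key follows the usual PEFT LoRA naming pattern."""
    keys = list(keys)
    if not keys:
        return False
    return all(
        k.startswith(PEFT_PREFIX) and any(m in k for m in PEFT_MARKERS)
        for k in keys
    )


# Hypothetical sample keys, for illustration only.
peft_keys = [
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight",
    "base_model.model.model.layers.0.self_attn.q_proj.lora_B.weight",
]
other_keys = ["layers.0.attn.q.lora_a", "layers.0.attn.q.lora_b"]

print(looks_like_peft(peft_keys))   # True
print(looks_like_peft(other_keys))  # False
```

If the weights are stored as a safetensors file, the key names could be read with `safetensors.safe_open(path, framework="pt")` and its `keys()` method, then passed to this function.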

To convert to PEFT format for serving:

from tinker_cookbook import weights

weights.build_lora_adapter(
    base_model="moonshotai/Kimi-K2.5",
    adapter_path="./this_dir",
    output_path="./peft_dir",
)
# Requires the K2.5 base model to be accessible via the Hugging Face Hub
# (snapshot_download will fetch ~600 GB if it is not already cached).

The conversion needs at least ~32 GB of RAM and ~100 GB of free disk.
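Because the conversion is resource-hungry, it may be worth checking the machine before starting. The following is a minimal sketch using only the standard library. The helper names (`free_disk_gib`, `total_ram_gib`, `preflight`) are hypothetical, and the default thresholds simply mirror the approximate figures above.

```python
import os
import shutil
from typing import List, Optional

GIB = 1024 ** 3


def free_disk_gib(path: str = ".") -> float:
    """Free disk space, in GiB, on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / GIB


def total_ram_gib() -> Optional[float]:
    """Total physical RAM in GiB, or None where os.sysconf cannot report it."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        return None


def preflight(path: str = ".", min_ram_gib: float = 32,
              min_disk_gib: float = 100) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    disk = free_disk_gib(path)
    if disk < min_disk_gib:
        problems.append(f"free disk {disk:.0f} GiB < {min_disk_gib:.0f} GiB")
    ram = total_ram_gib()
    # Skip the RAM check on platforms where the total is unavailable.
    if ram is not None and ram < min_ram_gib:
        problems.append(f"RAM {ram:.0f} GiB < {min_ram_gib:.0f} GiB")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("warning:", problem)
```

The check reports total RAM rather than currently available RAM, so it is only a coarse guard. It also does not account for the base-model download if that is not yet cached.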

Training details

  • Base model: moonshotai/Kimi-K2.5
  • LoRA rank: 32
  • Reward: partial credit with the new productive_ops signal, which counts atom translations rather than every non-grab instruction
  • Dataset: opus-magnum k1 + single-arm-easy (~330 tasks)
  • Step: 45 (peak held-out best_reward = 0.0933)
  • wandb run: 6nlb1b6b
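To make the distinction in the reward bullet concrete, the toy sketch below contrasts counting every non-grab instruction with counting only the instructions that translate an atom. The trace format, the instruction names, and both function names are hypothetical. The actual productive_ops computation, and how it enters the partial-credit reward, is not specified here.

```python
from typing import List, Tuple

# Each step is (instruction, atom_moved). Hypothetical trace format,
# for illustration only.
Step = Tuple[str, bool]


def count_non_grab(trace: List[Step]) -> int:
    """Count every instruction that is not a grab, productive or not."""
    return sum(1 for instr, _ in trace if instr != "grab")


def count_productive_ops(trace: List[Step]) -> int:
    """Count only the steps during which an atom actually changed position."""
    return sum(1 for _, atom_moved in trace if atom_moved)


# An arm that rotates twice with an empty gripper, then grabs,
# rotates once while holding an atom, and drops it.
trace = [
    ("rotate_cw", False),
    ("rotate_cw", False),
    ("grab", False),
    ("rotate_cw", True),
    ("drop", False),
]

print(count_non_grab(trace))        # 4
print(count_productive_ops(trace))  # 1
```

In this toy trace the first count rewards the two idle rotations, while the second credits only the single rotation that moved an atom.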

See research-notes/2026-04-26-kimi-parallel-runs-status.md (private) for the experiment lineage.
