GRM-2.5-Air

1. Introduction

GRM-2.5-Air is a 0.8B-parameter reasoning model built for general-purpose local AI. It is designed to deliver strong performance across a wide range of tasks while remaining efficient enough to run even on very small devices.

The model is optimized for structured reasoning, helping it produce more accurate, coherent, and reliable responses on complex problems. GRM-2.5-Air aims to combine strong reasoning ability, practical usability, and efficient deployment in a compact form factor.

2. Key Capabilities

  • Strong Reasoning for Everyday and Advanced Tasks: GRM-2.5-Air is built to handle both daily conversations and more demanding reasoning workloads with clarity and consistency.
  • Optimized for Local Deployment: GRM-2.5-Air is designed for accessible inference across a broad range of hardware, making it a practical choice for users who want capable AI running locally.
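Local inference can be sketched with the Hugging Face `transformers` pipeline API. This is a minimal illustration, not an official recipe from the model card; the generation parameters are illustrative assumptions.

```python
# Minimal local-inference sketch for GRM-2.5-Air using the
# Hugging Face transformers pipeline API. Generation settings
# here are illustrative assumptions, not official defaults.
from transformers import pipeline


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load GRM-2.5-Air and produce a completion locally."""
    pipe = pipeline("text-generation", model="OrionLLM/GRM-2.5-Air")
    out = pipe(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts with a "generated_text" key.
    return out[0]["generated_text"]


# Example (downloads the model weights on first run):
# print(generate("Explain why the sky is blue in one sentence."))
```

Because the model is only 0.8B parameters, this sketch should fit comfortably in memory on consumer hardware without quantization.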

3. Performance

GRM-2.5-Air is designed to be a highly capable option for local AI use across many scenarios. It performs well in complex reasoning tasks, everyday chat, coding, and agentic workflows, while maintaining the efficiency expected from a compact 0.8B model.

Its focus is not only raw capability, but also practical intelligence: strong reasoning, stable long-context behavior, and usability on consumer hardware.

| Benchmark | GRM-2.5-Plus (Closed) | GRM-2.5 | GRM-2.5-Air | GRM-7B | GRM-1.5B |
|---|---|---|---|---|---|
| **Knowledge & STEM** | | | | | |
| MMLU-Pro | 84.2 | 80.1 | 43.6 | -- | -- |
| GPQA Diamond | 82.7 | 76.7 | 12.5 | 53.7 | 29.5 |
| **Instruction Following** | | | | | |
| IFEval | 91.8 | 90.2 | 44.5 | -- | -- |
| MultiChallenge | 56.5 | 49.8 | 19.3 | -- | -- |
| **Reasoning & Coding** | | | | | |
| HMMT Feb 25 | 84.4 | 75.2 | -- | 42.7 | 27.3 |
| HMMT Nov 25 | 83.2 | 77.2 | -- | -- | -- |
| LiveCodeBench v6 | 67.2 | 56.9 | -- | 51.7 | 39.4 |
| **Agent** | | | | | |
| TAU2-Bench | 80.5 | 80.2 | 11.6 | -- | -- |
| DeepPlanning | 18.6 | 17.9 | -- | -- | -- |
| OSWorld-Verified | 42.4 | 36.0 | -- | -- | -- |

4. Family

The GRM-2.5 family is available in several sizes to suit different use cases.

| Model | Size | Domain |
|---|---|---|
| GRM-2.5-Plus | 9B | Closed model for research and agent purposes |
| GRM-2.5 | 4B | Powerful on-device deployment for difficult tasks |
| GRM-2.5-Air | 0.8B | Any-device deployment for everyday chat |

5. Architecture

GRM-2.5 is built on the Qwen3.5 architecture and is optimized for complex tasks, agent environments, and everyday chat.

It applies this reasoning-focused design to a stronger, larger foundation, resulting in a model that punches above its weight class on structured reasoning tasks while remaining deployable on consumer hardware.


GRM-2.5 is developed by OrionLLM and released under the Apache 2.0 License.
