GRM-2.5-Air

1. Introduction

GRM-2.5-Air is a 0.8B-parameter reasoning model built for general-purpose local AI. It is designed to deliver strong performance across a wide range of tasks while remaining efficient enough to run even on very small devices.

The model is optimized for structured reasoning, helping it produce more accurate, coherent, and reliable responses on complex problems. GRM-2.5-Air aims to combine strong reasoning ability, practical usability, and efficient deployment in a compact form factor.

2. Key Capabilities

  • Strong Reasoning for Everyday and Advanced Tasks: GRM-2.5-Air is built to handle both daily conversations and more demanding reasoning workloads with clarity and consistency.
  • Optimized for Local Deployment: GRM-2.5-Air is designed for accessible inference across a broad range of hardware, making it a practical choice for users who want capable AI running locally.
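Local inference can be sketched with the Hugging Face `transformers` pipeline API. This is a minimal illustration, not an official recipe from the model card; the generation parameters are illustrative assumptions.

```python
# Minimal local-inference sketch for GRM-2.5-Air using the
# Hugging Face transformers pipeline API. Generation settings
# here are illustrative assumptions, not official defaults.
from transformers import pipeline


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load GRM-2.5-Air and produce a completion locally."""
    pipe = pipeline("text-generation", model="OrionLLM/GRM-2.5-Air")
    out = pipe(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts with a "generated_text" key.
    return out[0]["generated_text"]


# Example (downloads the model weights on first run):
# print(generate("Explain why the sky is blue in one sentence."))
```

Because the model is only 0.8B parameters, this sketch should fit comfortably in memory on consumer hardware without quantization.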

3. Performance

GRM-2.5-Air is designed to be a highly capable option for local AI use across many scenarios. It performs well in complex reasoning tasks, everyday chat, coding, and agentic workflows, while maintaining the efficiency expected from a compact 0.8B model.

Its focus is not only raw capability, but also practical intelligence: strong reasoning, stable long-context behavior, and usability on consumer hardware.

| Benchmark | GRM-2.5-Plus (Closed) | GRM-2.5 | GRM-2.5-Air | GRM-7B | GRM-1.5B |
|---|---|---|---|---|---|
| **Knowledge & STEM** | | | | | |
| MMLU-Pro | 84.2 | 80.1 | 43.6 | -- | -- |
| GPQA Diamond | 82.7 | 76.7 | 12.5 | 53.7 | 29.5 |
| **Instruction Following** | | | | | |
| IFEval | 91.8 | 90.2 | 44.5 | -- | -- |
| MultiChallenge | 56.5 | 49.8 | 19.3 | -- | -- |
| **Reasoning & Coding** | | | | | |
| HMMT Feb 25 | 84.4 | 75.2 | -- | 42.7 | 27.3 |
| HMMT Nov 25 | 83.2 | 77.2 | -- | -- | -- |
| LiveCodeBench v6 | 67.2 | 56.9 | -- | 51.7 | 39.4 |
| **Agent** | | | | | |
| TAU2-Bench | 80.5 | 80.2 | 11.6 | -- | -- |
| DeepPlanning | 18.6 | 17.9 | -- | -- | -- |
| OSWorld-Verified | 42.4 | 36.0 | -- | -- | -- |

4. Family

The GRM-2.5 family is available in several sizes to suit different use cases.

| Model | Size | Domain |
|---|---|---|
| GRM-2.5-Plus | 9B | Closed model for research and agent purposes |
| GRM-2.5 | 4B | Powerful on-device deployment for difficult tasks |
| GRM-2.5-Air | 0.8B | Any-device deployment for everyday chat |

5. Architecture

GRM-2.5 is built on the Qwen3.5 architecture and is optimized for complex tasks, agent environments, and everyday chat.

It applies this reasoning-focused design to a stronger, larger foundation, resulting in a model that punches above its weight class on structured reasoning tasks while remaining deployable on consumer hardware.


GRM-2.5 is developed by OrionLLM and released under the Apache 2.0 License.
