---
license: apache-2.0
language:
- en
base_model:
- NiuTrans/GRAM-Qwen3-4B-RewardModel
library_name: transformers
tags:
- text-generation-inference
- rewardmodel
- GRAM
- RLHF
- reward
pipeline_tag: text-ranking
---

# **GRAM-Qwen3-4B-RewardModel-GGUF**

> GRAM-Qwen3-4B-RewardModel is a generative reward model released by NiuTrans to improve reward generalization for Large Language Models (LLMs). Unlike traditional reward models that depend heavily on task-specific labeled data, it uses both labeled and unlabeled data, which allows it to generalize better across tasks. The generative reward model is first pre-trained on large amounts of unlabeled data and then fine-tuned with supervised data. The methodology also employs label smoothing and a regularized ranking loss to further boost performance, bridging the gap between generative and discriminative reward modeling techniques.

> This model is built on the Qwen3-4B base and can be used directly or adapted for aligning LLMs, without training a reward model from scratch on extensive datasets. On the JudgeBench benchmark, which covers Chat, Code, Math, and Safety tasks, GRAM-Qwen3-4B-RewardModel achieves an average score of 65.9, making it suitable as an open-source, plug-and-play reward model for a variety of LLM alignment scenarios. The repository provides usage instructions and demonstration code for research and development purposes.


## Model Files

| Model File name | Size | QuantType |
|---|---|---|
| GRAM-Qwen3-4B-RewardModel.BF16.gguf | 8.05 GB | BF16 |
| GRAM-Qwen3-4B-RewardModel.F16.gguf | 8.05 GB | F16 |
| GRAM-Qwen3-4B-RewardModel.F32.gguf | 16.1 GB | F32 |
| GRAM-Qwen3-4B-RewardModel.Q2_K.gguf | 1.67 GB | Q2_K |
| GRAM-Qwen3-4B-RewardModel.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| GRAM-Qwen3-4B-RewardModel.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| GRAM-Qwen3-4B-RewardModel.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| GRAM-Qwen3-4B-RewardModel.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| GRAM-Qwen3-4B-RewardModel.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| GRAM-Qwen3-4B-RewardModel.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| GRAM-Qwen3-4B-RewardModel.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| GRAM-Qwen3-4B-RewardModel.Q6_K.gguf | 3.31 GB | Q6_K |
| GRAM-Qwen3-4B-RewardModel.Q8_0.gguf | 4.28 GB | Q8_0 |
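
As a rough aid for choosing among the files above, the following minimal sketch picks the largest quant whose file fits a given memory budget. The sizes are copied from the table; the helper names `pick_quant` and `gguf_filename` and the `headroom_gb` allowance are hypothetical and introduced only for illustration. File size is only a lower bound on memory use, so the headroom value is an assumption you should tune for your context length and backend.

```python
# File sizes in GB, copied from the Model Files table above.
QUANT_SIZES_GB = {
    "BF16": 8.05,
    "F16": 8.05,
    "F32": 16.1,
    "Q2_K": 1.67,
    "Q3_K_L": 2.24,
    "Q3_K_M": 2.08,
    "Q3_K_S": 1.89,
    "Q4_K_M": 2.5,
    "Q4_K_S": 2.38,
    "Q5_K_M": 2.89,
    "Q5_K_S": 2.82,
    "Q6_K": 3.31,
    "Q8_0": 4.28,
}


def pick_quant(budget_gb, headroom_gb=0.0):
    """Return the largest quant whose file fits in budget_gb - headroom_gb.

    headroom_gb is a hypothetical allowance for the KV cache and runtime
    buffers. Returns None when no file fits.
    """
    usable = budget_gb - headroom_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def gguf_filename(quant):
    """Build the file name using the naming pattern shown in the table."""
    return f"GRAM-Qwen3-4B-RewardModel.{quant}.gguf"


# Example: 4 GB available, 1 GB reserved as (assumed) runtime headroom.
choice = pick_quant(budget_gb=4.0, headroom_gb=1.0)
print(choice, gguf_filename(choice))
```

Larger files generally preserve more of the original weights' precision, but size alone does not determine output quality, so treat the result as a starting point rather than a recommendation.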

## Quants Usage 

(The files above are listed alphabetically by quant type, not by size or quality. IQ-quants are often preferable to similar-sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)