RLVR Training of Apertus 8B with GRPO on the GSM8K Dataset
This model is a fine-tuned version of the Apertus 8B Instruct model, further trained using the RLVR (Reinforcement Learning with Verifiable Rewards) framework on the GSM8K dataset. The base Apertus models are introduced in the paper Apertus: Democratizing Open and Compliant LLMs for Global Language Environments.
- Project Page: https://www.swiss-ai.org/apertus
- Code Repository: https://github.com/swiss-ai/apertus-tech-report
Results
Validation accuracy improved from 46.41% to 66.23%.
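
To put the two figures in perspective, the short calculation below derives the absolute and relative gain from the accuracies reported above. It is plain arithmetic on those two numbers; the variable names are illustrative.

```python
# Validation accuracies reported above, in percent.
acc_before = 46.41
acc_after = 66.23

# Absolute gain in percentage points.
abs_gain = acc_after - acc_before

# Relative gain with respect to the starting accuracy.
rel_gain = abs_gain / acc_before * 100

print(f"absolute gain: {abs_gain:.2f} points")  # absolute gain: 19.82 points
print(f"relative gain: {rel_gain:.1f}%")        # relative gain: 42.7%
```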
Compute
Training was performed on a GPU node with 4× NVIDIA H100 (95 GB) GPUs and ran for approximately 5 hours.
Hyperparameters
| Rollouts | |
|---|---|
| num_unique_prompts_rollout | 32 |
| num_samples_per_prompt_rollout | 8 |
| temperature | 0.8 |

| Optimization | |
|---|---|
| learning_rate | 3.0e-7 |
| beta | 0.01 |
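
For readers unfamiliar with GRPO, the sketch below shows how the rollout settings fit together: each step samples 32 × 8 = 256 completions, and each completion's reward is compared against the other samples drawn for the same prompt. The normalization used here (subtract the group mean, divide by the group standard deviation) follows the commonly described GRPO formulation and is an assumption; the actual training code may differ in details. The function name, `EPS`, and the example rewards are hypothetical.

```python
from statistics import mean, pstdev

NUM_UNIQUE_PROMPTS = 32   # num_unique_prompts_rollout
SAMPLES_PER_PROMPT = 8    # num_samples_per_prompt_rollout
EPS = 1e-8                # assumed small constant to avoid division by zero

# Completions generated per rollout step.
completions_per_step = NUM_UNIQUE_PROMPTS * SAMPLES_PER_PROMPT  # 256


def group_relative_advantages(rewards):
    """Normalize the rewards of one prompt's samples within their group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + EPS) for r in rewards]


# Hypothetical binary rewards for the 8 samples of a single prompt:
# 1.0 if the final answer was verified as correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Under this formulation, correct samples receive a positive advantage and incorrect ones a negative advantage. When all 8 samples for a prompt receive the same reward, every advantage is zero and that prompt contributes no learning signal.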
Notes
- The format reward was not applied because neither the instruct nor the base model was able to get a correct answer. As a result, the model is not able to use `<think> </think>`.
- Funny observation: the model memorized the dataset. In one attempt, the model answered the question but, because the format was unfamiliar, it went on to recite another question from the same dataset; another time it output HTML code, presumably from the page where it had seen the question.
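
The verifiable reward mentioned in these notes can be illustrated with a minimal sketch. GSM8K reference solutions end with a line of the form `#### <number>`. The sketch assumes the reward is 1.0 when the last number in the model's completion matches that reference and 0.0 otherwise, with no format reward. The function names, the extraction rule, and the example strings are hypothetical and not taken from the training code.

```python
import re

# Matches integers and decimals, optionally with thousands separators.
NUMBER_RE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_reference(solution: str) -> str:
    """Return the final answer of a GSM8K-style reference ('... #### 72')."""
    return solution.split("####")[-1].strip().replace(",", "")


def extract_prediction(completion: str):
    """Return the last number in the completion, or None if there is none."""
    matches = NUMBER_RE.findall(completion)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def verifiable_reward(completion: str, solution: str) -> float:
    """Assumed binary reward: 1.0 on an exact numeric match, else 0.0."""
    predicted = extract_prediction(completion)
    if predicted is None:
        return 0.0
    try:
        matched = float(predicted) == float(extract_reference(solution))
    except ValueError:
        return 0.0
    return 1.0 if matched else 0.0


# Hypothetical reference solution in GSM8K format.
solution = "She sold 48 + 24 = 72 clips.\n#### 72"
print(verifiable_reward("48 + 24 = 72, so the answer is 72.", solution))  # 1.0
print(verifiable_reward("The answer is 70.", solution))                   # 0.0
```

A reward of this kind checks only the final number, so it gives the model no incentive to adopt a particular output format such as `<think> </think>` tags.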
Acknowledgements
This work builds upon and was inspired by the following contributions:
- RLVR: Verifiable Rewards for Reasoning Models — for introducing the verifiable reward framework used in this experiment.
- Allen Institute for AI — Open Instruct — for providing open-source infrastructure for RLHF/RLVR training.
- Apertus Project — for releasing the Apertus-8B base and instruct models used in this work.
Model tree for ABaroian/Apertus-8B-RLVR-GSM
Base model
swiss-ai/Apertus-8B-2509