Ornstein-31B-it — GGUF

GGUF quantizations of DJLougen/Ornstein-31B-it, a Gemma 4 31B vision-language model fine-tuned with Unsloth.

Support This Work

I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. All training compute is self-funded — balancing GPU costs against a student budget. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.

Support on Ko-fi


Available Quantizations

| File | Quant | Size | Description |
|---|---|---|---|
| Ornstein-31B-it-Q2_K.gguf | Q2_K | 11.1 GB | Smallest, lowest quality |
| Ornstein-31B-it-Q3_K_M.gguf | Q3_K_M | 14.2 GB | Low quality |
| Ornstein-31B-it-Q4_K_M.gguf | Q4_K_M | 17.4 GB | Recommended — good balance of quality and size |
| Ornstein-31B-it-Q5_K_M.gguf | Q5_K_M | 20.3 GB | High quality |
| Ornstein-31B-it-Q6_K.gguf | Q6_K | 23.5 GB | Very high quality |
| Ornstein-31B-it-Q8_0.gguf | Q8_0 | 30.4 GB | Near-lossless |
| mmproj-Ornstein-31B-it-F16.gguf | F16 | 1.1 GB | Vision encoder (required for image input) |
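As a rough sanity check on the table above, file size divided by parameter count gives the average bits stored per weight for each quant. A minimal sketch (sizes taken from the table, a dense 31B parameter count assumed, and GB treated as 10^9 bytes, so the figures are approximate):

```python
# Estimate average bits per weight from file size and parameter count.
# Sizes are the GB figures from the quantization table; 31e9 params assumed.
PARAMS = 31e9

sizes_gb = {
    "Q2_K": 11.1,
    "Q3_K_M": 14.2,
    "Q4_K_M": 17.4,
    "Q5_K_M": 20.3,
    "Q6_K": 23.5,
    "Q8_0": 30.4,
}

def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """File size in GB -> approximate average bits stored per parameter."""
    return size_gb * 1e9 * 8 / params

for quant, gb in sizes_gb.items():
    print(f"{quant}: ~{bits_per_weight(gb):.2f} bits/weight")
```

The same arithmetic also gives a quick lower bound on memory needed to load a given quant, before accounting for KV cache and activations.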

Usage

llama.cpp

# Text-only
llama-cli -m Ornstein-31B-it-Q4_K_M.gguf -p "Hello, tell me about yourself"

# With vision (image input)
llama-gemma3-cli -m Ornstein-31B-it-Q4_K_M.gguf --mmproj mmproj-Ornstein-31B-it-F16.gguf --image photo.jpg -p "Describe this image"
# Note: newer llama.cpp builds replace llama-gemma3-cli with llama-mtmd-cli (same flags)
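For longer sessions, llama.cpp's built-in server exposes an OpenAI-compatible HTTP endpoint instead of a one-shot CLI run. A sketch (the port is an arbitrary choice, and `--mmproj` support in `llama-server` assumes a reasonably recent llama.cpp build):

```shell
# Serve the model (with the vision projector) over an OpenAI-compatible API
llama-server -m Ornstein-31B-it-Q4_K_M.gguf \
    --mmproj mmproj-Ornstein-31B-it-F16.gguf \
    --port 8080
```

Once running, any OpenAI-compatible client can talk to `http://localhost:8080/v1`.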

llama-cpp-python

from llama_cpp import Llama

llm = Llama(model_path="Ornstein-31B-it-Q4_K_M.gguf", n_gpu_layers=-1)  # -1 offloads all layers to GPU
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(output["choices"][0]["message"]["content"])
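The value returned by `create_chat_completion` follows the OpenAI chat-completion shape, so the chained indexing above can be wrapped in a small defensive helper. A sketch, exercised here against a mocked response dict (the dict is an illustrative shape, not captured model output):

```python
def reply_text(output: dict) -> str:
    """Extract the assistant message text from an OpenAI-style completion dict."""
    choices = output.get("choices") or []
    if not choices:
        raise ValueError("no choices in completion output")
    return choices[0]["message"]["content"]

# Illustrative response shape (not real model output):
mock = {"choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]}
print(reply_text(mock))  # → Hi there!
```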

Details

Downloads last month: 869
Format: GGUF
Model size: 31B params
Architecture: gemma4


