# EthereumGPT-8B-GGUF (v8, DPO-aligned)
Qwen3-8B fine-tuned on ~26,000 Ethereum Q&A examples derived from the Eth R&D Discord archive, then DPO-aligned to refuse questions about unknown topics rather than hallucinate. Specializes in Ethereum protocol development, EIPs, consensus mechanisms, the EVM, client implementations, and developer attribution.

v8 adds DPO (Direct Preference Optimization) on top of v7's SFT training, dramatically improving refusal of fabricated entities (fake EIPs, fake developers, non-Ethereum topics) while maintaining 100% accuracy on known topics.
## Quick Start

### llama-server (recommended)
```bash
# Install (macOS)
brew install llama.cpp

# Serve v8 (auto-downloads from Hugging Face)
llama-server \
  --hf-repo satyajitdas/EthereumGPT-8B-GGUF \
  --hf-file EthereumGPT-8B-v8-Q8_0.gguf \
  --port 8081 -ngl 99 --ctx-size 4096

# Query via curl
curl -s http://localhost:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are EthereumGPT, an AI assistant specializing in Ethereum protocol development."},
      {"role": "user", "content": "What is Fusaka?"}
    ],
    "max_tokens": 512
  }' | python3 -m json.tool
```
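The same endpoint speaks the OpenAI chat-completions protocol, so any OpenAI-compatible client works too. A minimal sketch, assuming the `openai` Python package (`pip install openai`); the model name is arbitrary because llama-server serves a single loaded model:

```python
# Minimal sketch: query llama-server through its OpenAI-compatible API.
from openai import OpenAI

# llama-server does not check API keys; any placeholder string works.
client = OpenAI(base_url="http://localhost:8081/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="ethereumgpt",  # arbitrary: the server only has one model loaded
    messages=[
        {"role": "system", "content": "You are EthereumGPT, an AI assistant specializing in Ethereum protocol development."},
        {"role": "user", "content": "What is Fusaka?"},
    ],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```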
### LM Studio

- Search for `satyajitdas/EthereumGPT-8B-GGUF` in the model browser and download `EthereumGPT-8B-v8-Q8_0.gguf`
- Set the system prompt to: `You are EthereumGPT, an AI assistant specializing in Ethereum protocol development, smart contracts, consensus mechanisms, and the broader Ethereum ecosystem.`
- Ensure the chat template is set to Qwen3 or ChatML (`<|im_start|>` / `<|im_end|>` tokens); the rendered format is sketched below
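For reference, this is what a ChatML/Qwen3 prompt looks like once the template is applied. LM Studio renders it for you, but knowing the expected token layout makes a misconfigured template easy to spot. A minimal sketch (the helper function below is illustrative, not part of any API):

```python
# Illustrative: render a ChatML-formatted prompt the way Qwen3 expects it.
SYSTEM = ("You are EthereumGPT, an AI assistant specializing in Ethereum "
          "protocol development, smart contracts, consensus mechanisms, "
          "and the broader Ethereum ecosystem.")

def chatml_prompt(system: str, user: str) -> str:
    """Wrap system and user turns in <|im_start|>/<|im_end|> markers."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt(SYSTEM, "What is Fusaka?"))
```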
### llama-cli (interactive)

```bash
./llama-cli -m EthereumGPT-8B-Q8_0.gguf \
  --chat-template chatml \
  -sys "You are EthereumGPT, an AI assistant specializing in Ethereum protocol development." \
  -cnv
```
### Ollama

```bash
# Download the GGUF and Modelfile.v8, then:
ollama create ethereumgpt -f Modelfile.v8
ollama run ethereumgpt
```

The `Modelfile.v8` includes the full Qwen3 Go template with thinking support, the EthereumGPT system prompt, and tuned sampler parameters. **Important:** the GGUF must include `tokenizer.ggml.add_bos_token = false` metadata (the files in this repo already do). Without it, Ollama's qwen3 runner prepends a spurious BOS token, causing degenerate output.
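To verify that flag on a downloaded file, the `gguf` Python package that ships alongside llama.cpp can read the metadata. A sketch, assuming `pip install gguf` (the field-access internals may vary between gguf versions):

```python
from gguf import GGUFReader

reader = GGUFReader("EthereumGPT-8B-v8-Q8_0.gguf")
field = reader.fields.get("tokenizer.ggml.add_bos_token")
if field is None:
    print("add_bos_token metadata missing: Ollama may prepend a spurious BOS")
else:
    # Scalar metadata values are stored in the field's parts/data arrays.
    value = bool(field.parts[field.data[0]][0])
    print(f"tokenizer.ggml.add_bos_token = {value}")  # expect False for these files
```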
## Available Files

| File | Size | Description |
|---|---|---|
| `EthereumGPT-8B-v8-Q8_0.gguf` | 8.1 GB | v8 DPO-aligned, 8-bit quantized (recommended) |
| `EthereumGPT-8B-Q8_0.gguf` | 8.1 GB | v7 SFT only, 8-bit quantized |
| `Modelfile.v8` | 2 KB | Ollama Modelfile with Qwen3 thinking template |
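If you prefer fetching files programmatically rather than via `--hf-repo`, a sketch using `huggingface_hub` (`pip install huggingface_hub`):

```python
from huggingface_hub import hf_hub_download

# Downloads to the local HF cache and returns the path;
# pass it to llama-server / llama-cli with -m.
path = hf_hub_download(
    repo_id="satyajitdas/EthereumGPT-8B-GGUF",
    filename="EthereumGPT-8B-v8-Q8_0.gguf",
)
print(path)
```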
## Evaluation

### v8 (DPO-aligned, latest)

86% pass rate (24/28 questions) across 4 categories:
| Category | v7 | v8 | Description |
|---|---|---|---|
| Anchor | 5/5 (100%) | 5/5 (100%) | Injected facts (Fusaka, ePBS, PeerDAS, etc.) |
| Attribution | 3/3 (100%) | 3/3 (100%) | Who said what in Discord discussions |
| Negative | 6/15 (40%) | 14/15 (93%) | Refusal of fabricated entities |
| Boundary | 2/5 (40%) | 2/5 (40%) | Adjacent but unknown topics |
| Overall | 16/28 (57%) | 24/28 (86%) | +29 percentage points |
DPO training nearly eliminated hallucination on unknown topics (+53 pp on the negative category) with zero alignment tax on known topics.
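The negative category boils down to probing with fabricated entities and passing only on refusal. A minimal sketch of that style of check against the local llama-server; the probe question and refusal markers are illustrative stand-ins, not the actual eval set:

```python
import requests

PROBE = "What is EIP-99999?"  # hypothetical probe: intentionally nonexistent
REFUSAL_MARKERS = ("not aware", "no record", "does not exist", "don't have information")

resp = requests.post(
    "http://localhost:8081/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are EthereumGPT, an AI assistant specializing in Ethereum protocol development."},
            {"role": "user", "content": PROBE},
        ],
        "max_tokens": 256,
    },
    timeout=120,
)
answer = resp.json()["choices"][0]["message"]["content"]
passed = any(m in answer.lower() for m in REFUSAL_MARKERS)
print("PASS" if passed else "FAIL", "-", answer[:120])
```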
## Training Details

### v7 SFT (base)

- Base model: Qwen/Qwen3-8B via mlx-community/Qwen3-8B-bf16
- Method: LoRA (rank 64, alpha 128) via MLX on an Apple M4 Max with 128 GB (see the sketch after this list)
- Data: ~26,000 examples (factual, attribution, and evolution Q&A) from 10,233 Discord chunks
- Training: lr 2e-5, batch 2, grad_accum 4, grad_checkpoint=true, ~12 hours
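Conceptually, LoRA freezes the base weight `W` and trains only a low-rank delta scaled by `alpha / rank`. A toy numpy sketch with the v7 hyperparameters, not the MLX training code itself (the 4096 dimension is illustrative):

```python
import numpy as np

d, r, alpha = 4096, 64, 128  # hidden size (illustrative), LoRA rank, LoRA alpha
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen base weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + (alpha / r) * B @ A; only A and B get gradients.
    return x @ (W + (alpha / r) * (B @ A)).T

print(lora_forward(rng.standard_normal((1, d))).shape)  # (1, 4096)
```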
### v8 DPO (alignment layer)

- Base: v7 fused model
- Method: DPO via mlx-lm-lora (rank 32, beta 0.1, sigmoid loss; the objective is shown below)
- Data: 108 preference pairs (71 negative + 37 positive), 98 train / 10 valid
- Training: 147 iters, lr 5e-6, loss 0.721 → 0.002, ~20 min, 38.9 GB peak
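For reference, the sigmoid DPO objective (here with $\beta = 0.1$ and the frozen v7 fused model as the reference policy $\pi_{\text{ref}}$) takes the standard form; mlx-lm-lora's exact implementation may differ in detail:

```math
\mathcal{L}_{\text{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} \;-\; \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
```

where $y_w$ is the preferred response (a grounded answer or a refusal) and $y_l$ the dispreferred one (a hallucination).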
## What It Knows
Trained on real Ethereum R&D Discord discussions spanning 2019-2026:
- Protocol development: EIP-1559, EIP-4844, The Merge, Pectra, Fusaka
- Consensus: Beacon chain, forkchoice, attestations, validator lifecycle, slashing, ePBS
- Execution: EVM opcodes, EOF, SELFDESTRUCT removal, state management, gas mechanics
- Infrastructure: Geth, Prysm, Lighthouse, Reth, Nethermind, Teku, Nimbus, Lodestar
- Research: ePBS, Verkle trees, PeerDAS, data availability sampling, MEV/PBS
- People: Core developers (Potuz, terencechain, lightclient, parithosh, timbeiko, etc.)
- Evolution: How key discussions (SELFDESTRUCT, ePBS, PeerDAS, Pectra scope) evolved over time
## Example Outputs

**Q: What is Fusaka?**

A: Fusaka is the name for an upcoming Ethereum hard fork that combines the Fulu (consensus layer) and Osaka (execution layer) upgrades, following the same naming convention as Dencun and Pectra. It is the next major network upgrade planned after Pectra. Key items discussed for inclusion include PeerDAS (EIP-7594) for improved data availability.

**Q: What does ePBS stand for?**

A: ePBS stands for enshrined Proposer-Builder Separation. It is a proposed protocol-level change that would enshrine the separation of block proposers and block builders directly into the consensus protocol, rather than relying on external middleware like MEV-Boost.

**Q: What was Potuz's position on ePBS timing games?**

A: Potuz argued that ePBS creates a timing game where builders can delay revealing their payloads to gain advantages, and that the system should be designed to prevent builders from having too much control over block timing.
## Limitations
- Trained on Discord conversations up to early 2026; may not reflect the very latest protocol changes
- Best at factual Q&A about Ethereum internals; not designed for code generation or smart contract auditing
- Requires the Qwen3/ChatML chat template and system prompt for best results
- Ollama requires the provided `Modelfile.v8` (with the full Qwen3 Go template); do not rely on auto-detected templates
## License
Apache 2.0 (same as the base Qwen3-8B model)