# 📦 Meta-Llama-3-70B-Instruct-4bit-gguf

meta-llama/Meta-Llama-3-70B-Instruct converted to GGUF format.


⭐ Star QuantLLM on GitHub


## 📖 About This Model

This model is meta-llama/Meta-Llama-3-70B-Instruct converted to GGUF format.

| Property | Value |
|---|---|
| Base Model | meta-llama/Meta-Llama-3-70B-Instruct |
| Format | GGUF |
| Quantization | None (Full Precision) |
| License | apache-2.0 |
| Created With | QuantLLM |

## 🚀 Quick Start

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")
tokenizer = AutoTokenizer.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")

# Generate text
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
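
If the repository ships only GGUF weights, plain `from_pretrained` may not find them; newer transformers releases can load GGUF checkpoints via the `gguf_file` argument, or you can run the file directly with llama-cpp-python. Below is a minimal sketch assuming `llama-cpp-python` is installed (`pip install llama-cpp-python`); the filename pattern is a placeholder, so check the repository's file list for the actual `*.gguf` name:

```python
# Minimal sketch: run the GGUF file directly with llama-cpp-python.
# The filename below is a placeholder glob -- match it to the actual
# *.gguf file published in this repository.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf",
    filename="*Q4_K_M.gguf",  # placeholder; adjust to the real file name
    n_ctx=4096,               # context window size
)

output = llm("Once upon a time", max_tokens=100)
print(output["choices"][0]["text"])
```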

### With QuantLLM

```python
from quantllm import TurboModel

# Load with automatic optimization
model = TurboModel.from_pretrained("QuantLLM/Meta-Llama-3-70B-Instruct-4bit-gguf")

# Generate
response = model.generate("Write a poem about coding")
print(response)
```

### Requirements

```bash
pip install transformers torch
```

## 📊 Model Details

| Property | Value |
|---|---|
| Original Model | meta-llama/Meta-Llama-3-70B-Instruct |
| Format | GGUF |
| Quantization | Full Precision |
| License | apache-2.0 |
| Export Date | 2026-04-24 |
| Exported By | QuantLLM v2.0 |

## 🚀 Created with QuantLLM


Convert any model to GGUF, ONNX, or MLX in one line!

```python
from quantllm import turbo

# Load any HuggingFace model
model = turbo("meta-llama/Meta-Llama-3-70B-Instruct")

# Export to any format
model.export("gguf", quantization="Q4_K_M")

# Push to HuggingFace
model.push("your-repo", format="gguf")
```
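
As a rough sizing check for the Q4_K_M export above (an estimate, not a measured figure): Q4_K_M averages roughly 4.8 bits per weight, so a 70B-parameter model lands in the low-40-GB range on disk:

```python
# Back-of-envelope size estimate for a Q4_K_M export of a 70B model.
# 4.8 bits/weight is an approximate Q4_K_M average, not an exact figure.
params = 70e9
bits_per_weight = 4.8
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # roughly 42 GB
```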

📚 Documentation · 🐛 Report Issue · 💡 Request Feature
