How to use Nexusflow/Athene-70B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Nexusflow/Athene-70B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Nexusflow/Athene-70B")
model = AutoModelForCausalLM.from_pretrained("Nexusflow/Athene-70B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use Nexusflow/Athene-70B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Nexusflow/Athene-70B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
    "model": "Nexusflow/Athene-70B",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
}'
# Alternatively, run the model with Docker:
docker model run hf.co/Nexusflow/Athene-70B
How to use Nexusflow/Athene-70B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Nexusflow/Athene-70B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
    "model": "Nexusflow/Athene-70B",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Nexusflow/Athene-70B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
    "model": "Nexusflow/Athene-70B",
    "messages": [
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
}'
How to use Nexusflow/Athene-70B with Docker Model Runner:
docker model run hf.co/Nexusflow/Athene-70B
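The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the response layout (`choices[0].message.content`) is assumed to follow the OpenAI chat completions format that these servers advertise compatibility with.

```python
import json
import urllib.request

MODEL_ID = "Nexusflow/Athene-70B"


def build_chat_request(base_url, user_message, model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, user_message, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body; adjust if your server differs.
    """
    request = build_chat_request(base_url, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server):
# print(chat("http://localhost:8000", "What is the capital of France?"))
```

Port 8000 matches the vLLM example and port 30000 matches the SGLang examples, so only the base URL needs to change between the two.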
We introduce Llama3-Athene-70B, an open-weights LLM trained with RLHF on top of Llama-3-70B-Instruct. Athene-70B scores 77.8% on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena, placing it close to the leading proprietary models listed below.
| Model | Arena-Hard |
|---|---|
| Claude-3.5-Sonnet (Proprietary) | 79.3% |
| GPT-4o (Proprietary) | 79.2% |
| Athene-70B (Open) | 77.8% |
| Gemini-Pro-1.5 (Proprietary) | 72.0% |
| Gemma-2-27B (Open) | 57.0% |
| Llama-3-70B (Open) | 46.6% |
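To make the gaps in the table easier to read, the short sketch below computes score differences in percentage points. The scores are copied from the table; the helper name `gap` is hypothetical.

```python
# Arena-Hard scores (percent), copied from the table above.
ARENA_HARD = {
    "Claude-3.5-Sonnet": 79.3,
    "GPT-4o": 79.2,
    "Athene-70B": 77.8,
    "Gemini-Pro-1.5": 72.0,
    "Gemma-2-27B": 57.0,
    "Llama-3-70B": 46.6,
}


def gap(model, reference):
    """Return model's score minus reference's score, in percentage points."""
    return round(ARENA_HARD[model] - ARENA_HARD[reference], 1)


for name in ARENA_HARD:
    print(f"{name}: {gap(name, 'Llama-3-70B'):+.1f} points vs Llama-3-70B")
```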
Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is a simple usage example with the Transformers library.
import transformers
import torch

model_id = "Nexusflow/Athene-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][-1])
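Since the chat template is shared with Llama-3-70B-Instruct, it can help to see roughly what the rendered prompt looks like. The sketch below is a hand-written approximation of the Llama-3 Instruct layout, not the template shipped with the tokenizer; the function name `render_llama3_prompt` is hypothetical. In practice, rely on `tokenizer.apply_chat_template`, which uses the authoritative template.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the Llama-3 Instruct prompt layout for a list of messages.

    Illustration only: the tokenizer's own chat template is the source of truth.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn: a role header, a blank line, the trimmed content, an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You can only speak with owl sounds."},
    {"role": "user", "content": "Whooo are you?"},
]
print(render_llama3_prompt(messages))
```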
We would like to thank the LMSYS Organization for their support in testing the model, and Meta AI and the open-source community for providing the datasets and base models.
@misc{Athene2024,
title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
url = {https://nexusflow.ai/blogs/athene},
author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
month = {July},
year = {2024}
}
Base model
meta-llama/Meta-Llama-3-70B