Instructions to use ValiantLabs/Llama3-70B-ShiningValiant2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ValiantLabs/Llama3-70B-ShiningValiant2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ValiantLabs/Llama3-70B-ShiningValiant2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ValiantLabs/Llama3-70B-ShiningValiant2")
model = AutoModelForCausalLM.from_pretrained("ValiantLabs/Llama3-70B-ShiningValiant2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ValiantLabs/Llama3-70B-ShiningValiant2 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ValiantLabs/Llama3-70B-ShiningValiant2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/Llama3-70B-ShiningValiant2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/ValiantLabs/Llama3-70B-ShiningValiant2
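The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default `base_url` assumes the vLLM server from the snippet above is listening on port 8000 (for SGLang as configured in this document, port 30000 would be used instead). The response parsing assumes the usual OpenAI-style chat completion shape.

```python
import json
import urllib.request

MODEL_ID = "ValiantLabs/Llama3-70B-ShiningValiant2"


def build_chat_request(user_message, model=MODEL_ID, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant's reply text.

    Assumes a server is already running and answers in the OpenAI
    chat completion format (choices[0].message.content).
    """
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` would return the reply text. Keeping the request construction separate from the network call makes the payload easy to inspect without a live server.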
- SGLang
How to use ValiantLabs/Llama3-70B-ShiningValiant2 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ValiantLabs/Llama3-70B-ShiningValiant2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/Llama3-70B-ShiningValiant2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ValiantLabs/Llama3-70B-ShiningValiant2" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/Llama3-70B-ShiningValiant2",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use ValiantLabs/Llama3-70B-ShiningValiant2 with Docker Model Runner:
docker model run hf.co/ValiantLabs/Llama3-70B-ShiningValiant2
This model is legacy - we recommend Shining Valiant 2 for Llama 3.1 70b!
Click here to support our open-source dataset and model releases!
Shining Valiant 2 is a chat model built on Llama 3 70b, finetuned on our data for friendship, insight, knowledge and enthusiasm.
- Finetuned on meta-llama/Meta-Llama-3-70B-Instruct for best available general performance
- Trained on our data, focused on science, engineering, technical knowledge, and structured reasoning
Version
This is the 2024-04-20 release of Shining Valiant 2 for Llama 3 70b.
We're working on more Llama 3 releases to come, including Shining Valiant and our Build Tools set of models. We're excited to bring these to everyone soon!
Prompting Guide
Shining Valiant 2 uses the Llama 3 Instruct prompt format:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>{{ user_msg_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>{{ model_answer_1 }}<|eot_id|>
Example input:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>You are Shining Valiant, a highly capable chat AI.<|eot_id|><|start_header_id|>user<|end_header_id|>Hi, can you write me a cover letter for a data analyst position?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
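For cases where the prompt has to be assembled by hand, the layout above can be sketched as a small Python helper. The function name `build_llama3_prompt` is hypothetical, and the sketch reproduces the format exactly as printed here, with no extra whitespace between tokens. The chat template shipped with the tokenizer may differ in whitespace, so `tokenizer.apply_chat_template` is likely the safer route when Transformers is available.

```python
def build_llama3_prompt(system_prompt: str, user_msg: str) -> str:
    """Assemble a single-turn prompt in the layout shown above.

    The result ends with an open assistant header so the model
    continues with its own answer.
    """
    parts = [
        "<|begin_of_text|>",
        "<|start_header_id|>system<|end_header_id|>", system_prompt, "<|eot_id|>",
        "<|start_header_id|>user<|end_header_id|>", user_msg, "<|eot_id|>",
        "<|start_header_id|>assistant<|end_header_id|>",
    ]
    return "".join(parts)


prompt = build_llama3_prompt(
    "You are Shining Valiant, a highly capable chat AI.",
    "Hi, can you write me a cover letter for a data analyst position?",
)
```

Here `prompt` matches the example input above character for character.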
WARNING: text-generation-webui
When using Llama 3 Instruct models (including Shining Valiant 2) with text-generation-webui, note that a current bug in webui can cause the model's ending tokens to be read incorrectly, resulting in unfinished outputs and incorrect structure.
For a temporary workaround if you encounter this issue, edit Shining Valiant 2's tokenizer_config file as indicated:
from "eos_token": "<|end_of_text|>",
to "eos_token": "<|eot_id|>",
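The same edit can be scripted. The sketch below assumes the file is a local JSON file (typically named `tokenizer_config.json` in the downloaded model directory) and that `eos_token` is stored as a plain string, as in the lines above; the helper name `patch_eos_token` is hypothetical. Keeping a copy of the original file before running it is advisable.

```python
import json
from pathlib import Path


def patch_eos_token(config_path, new_eos="<|eot_id|>"):
    """Rewrite the eos_token entry of a tokenizer config file in place.

    Returns the previous value so the change can be reverted later.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old_eos = config.get("eos_token")
    config["eos_token"] = new_eos
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return old_eos
```

Calling `patch_eos_token("path/to/tokenizer_config.json")` applies the workaround; passing the returned value back as `new_eos` restores the original setting once the webui issue is fixed.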
The Model
Shining Valiant 2 is built on top of Llama 3 70b Instruct, the highest-performing open-source model available at the time of this release.
Our private data adds specialist knowledge and Shining Valiant's personality: she's friendly, enthusiastic, insightful, knowledgeable, and loves to learn!
Shining Valiant 2 is created by Valiant Labs.
Check out our HuggingFace page to see all of our models!
We care about open source. For everyone to use.
We encourage others to finetune further from our models.

