
jinja template for story writing

by tachyphylaxis - opened Feb 28, 2024

Discussion

tachyphylaxis

Feb 28, 2024

I'm using aphrodite-engine, and that's the format it accepts. I'm pretty new to this stuff. Does that format provide for everything you need to prompt it properly? (I noticed you said that SillyTavern can't quite do it).

I rent GPU time with e.g. vast.ai. Which backend do you suggest I use, along with which software?

Thanks.

DreamGenX

DreamGen org Feb 28, 2024

Hey there, did you check out the guide and code for formatting? That should give you an idea of what the template should look like:

Opus V1 prompting guide with many (interactive) examples and prompts that you can copy.
Google Colab for interactive role-play using opus-v1.2-7b.
Python code to format the prompt correctly.

I am not familiar with Aphrodite engine, and did not find documentation, but I found this:
https://github.com/PygmalionAI/aphrodite-engine/blob/main/examples/chatml_template.jinja

You can adapt the ChatML template for story-writing with Opus by changing the "assistant" role to the "text" role. I am not sure if Aphrodite supports name interpolation in the Jinja template, which you would need for proper role-playing support. If it does, it's easy to add to the template; just follow the examples from the guide or the code I shared.

DreamGenX

DreamGen org Feb 28, 2024

I do something like this here: https://huggingface.co/dreamgen/opus-v1.2-7b/blob/main/tokenizer_config.json#L51 -- I change the assistant role to text in the HF tokenizer chat template.
Note that you still need a system prompt etc.
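
For illustration, here is a minimal sketch in plain Python of what such an adapted ChatML template does: it renders each message between the standard ChatML `<|im_start|>` and `<|im_end|>` markers, emits "text" wherever the role is "assistant", and opens a "text" turn for the model to continue. The function name `format_opus_prompt` and the example system prompt are hypothetical, and name interpolation for role-play is left out; the prompting guide remains the reference for the exact format.

```python
# Sketch only: a ChatML-style prompt builder with "assistant" renamed to "text".
# The function name and the example messages are illustrative assumptions.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_opus_prompt(messages, add_generation_prompt=True):
    """Render messages as ChatML, emitting "text" wherever "assistant" appears."""
    parts = []
    for message in messages:
        # Story-writing output uses the "text" role instead of "assistant".
        role = "text" if message["role"] == "assistant" else message["role"]
        parts.append(f"{IM_START}{role}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open a "text" turn so the model continues the story from here.
        parts.append(f"{IM_START}text\n")
    return "".join(parts)


messages = [
    # Hypothetical system prompt; a real one should follow the prompting guide.
    {"role": "system", "content": "You write long-form fiction."},
    {"role": "user", "content": "Write the opening scene."},
]
prompt = format_opus_prompt(messages)
print(prompt)
```

A Jinja chat template for a backend would apply the same mapping inside its message loop, and the system message still has to be supplied by the caller.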

tachyphylaxis

Feb 28, 2024

Aah, thanks. Changing the "assistant" role to "text" is basically what I was looking for, I think. Aphrodite-engine is a fork of vLLM by the PygmalionAI people which adds a lot of features that enthusiasts like.
