endless response
#31
by ramidahbash - opened
I have tried this with the AWQ version.
I deployed it using vLLM 0.10.2 on 4 H100 GPUs, and the response never ends. It looks like the model is having a conversation with itself: the response is a question it asks itself and then answers, in a never-ending loop.
Setting the temperature to 1.0 doesn't help.
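For reference, the requests go through vLLM's OpenAI-compatible endpoint, roughly like the sketch below (the port, model name, and stop strings are placeholders, not the exact values I use); `max_tokens` and `stop` only bound the output, they don't fix the looping itself:

```python
from openai import OpenAI

# vLLM serves an OpenAI-compatible API; the port and model name below are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="my-awq-model",             # placeholder: whatever name the server registered
    messages=[{"role": "user", "content": "Hello, can you introduce yourself?"}],
    temperature=1.0,
    max_tokens=512,                   # hard cap so a runaway generation still terminates
    stop=["\nUser:", "\nQuestion:"],  # placeholder stop strings for the self-dialogue pattern
)
print(response.choices[0].message.content)
```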
Try using vLLM 0.12.0.
vLLM 0.11.2 doesn't work for me, so why would 0.12.0 help?
This model is supported with vllm>=0.10.2.
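If you do upgrade, you can sanity-check generation outside the server with vLLM's offline API, something like this rough sketch (the model path, prompt, and stop string are placeholders):

```python
from vllm import LLM, SamplingParams

# Rough offline smoke test; the model path and prompt are placeholders.
llm = LLM(
    model="/path/to/awq-model",   # placeholder path to the AWQ checkpoint
    quantization="awq",
    tensor_parallel_size=4,       # matches the 4x H100 setup from the report
)

params = SamplingParams(
    temperature=1.0,
    max_tokens=256,               # bound the output even if the model keeps going
    stop=["\nUser:"],             # placeholder stop string
)

outputs = llm.generate(["Hello, can you introduce yourself?"], params)
print(outputs[0].outputs[0].text)
```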