Yeah, worst out-of-the-box experience so far.

#4 by LinuxMagic - opened

llama-server --models-preset ./models-preset.ini --host 0.0.0.0 --port 8080 --threads 16 --offline --flash-attn auto -fit off --gpu-layers 5 --ctx-size 16384 (router mode)
Only 9 tokens/second. I didn't need it to give me a full README as a response to my first query. And as for reasoning? I asked it to calculate pi to ten decimal places, and instead it responded with garbage, and then ...


The Art of Tekken: A Guide to Mastering the Game
Introduction to Tekken
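For the record, the expected answer is trivial to check locally; a minimal Python sketch (not part of the original post) of what a model with working reasoning should have produced:

```python
import math

# The query in question: pi to ten decimal places.
# math.pi is a double (~15-16 significant digits), so
# formatting to 10 decimals is well within its precision.
answer = f"{math.pi:.10f}"
print(answer)  # 3.1415926536
```

Any response that isn't this one line (give or take formatting) is a miss.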
