Instructions to use ghost-x/ghost-7b-alpha-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ghost-x/ghost-7b-alpha-gguf with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ghost-x/ghost-7b-alpha-gguf")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("ghost-x/ghost-7b-alpha-gguf", dtype="auto")
```
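Note that this repository contains GGUF files rather than standard weights, so the generic snippets above may not work out of the box. Transformers can load a GGUF checkpoint by dequantizing it if you pass a `gguf_file` argument; a minimal sketch, assuming the Q4_0 filename used in the llama-cpp-python example below and that the `gguf` package is installed:

```python
# Sketch: load a GGUF checkpoint through Transformers by pointing at a
# specific .gguf file in the repo (filename assumed from the repo's
# llama-cpp-python example). Requires: pip install transformers gguf
from transformers import AutoModelForCausalLM, AutoTokenizer

gguf_file = "ghost-7b-alpha-Q4_0.gguf"
tokenizer = AutoTokenizer.from_pretrained("ghost-x/ghost-7b-alpha-gguf", gguf_file=gguf_file)
# The GGUF weights are dequantized on load, so expect full-precision memory usage.
model = AutoModelForCausalLM.from_pretrained("ghost-x/ghost-7b-alpha-gguf", gguf_file=gguf_file)

inputs = tokenizer("Who are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```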
- llama-cpp-python
How to use ghost-x/ghost-7b-alpha-gguf with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ghost-x/ghost-7b-alpha-gguf",
    filename="ghost-7b-alpha-Q4_0.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
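`create_chat_completion` returns an OpenAI-style completion dict. A minimal sketch of the same call as above, capturing the result and printing just the assistant's reply:

```python
# Capture the returned dict and extract the generated text.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])
```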
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use ghost-x/ghost-7b-alpha-gguf with llama.cpp:
Install from brew
```bash
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
Install from WinGet (Windows)
```bash
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
Use pre-built binary
```bash
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
Build from source code
```bash
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
Use Docker
```bash
docker model run hf.co/ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
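With llama-server running via any of the install paths above, you can also call its OpenAI-compatible API from Python. A minimal sketch, assuming the server is on its default port 8080 (pass `--port` to change it); the model name in the request is illustrative, since llama-server serves whichever model it was started with:

```python
# Sketch: query a local llama-server through the OpenAI client.
# Requires: pip install openai
from openai import OpenAI

# llama-server does not check the API key, but the client requires a non-empty one.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")
response = client.chat.completions.create(
    model="ghost-7b-alpha",  # illustrative; the server uses the model it was launched with
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```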
- LM Studio
- Jan
- vLLM
How to use ghost-x/ghost-7b-alpha-gguf with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ghost-x/ghost-7b-alpha-gguf"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ghost-x/ghost-7b-alpha-gguf",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
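The same endpoint can be called from Python with the OpenAI client. A minimal sketch that streams tokens as they are generated, assuming the server started above is listening on port 8000:

```python
# Sketch: stream a chat completion from the local vLLM server.
# Requires: pip install openai
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
stream = client.chat.completions.create(
    model="ghost-x/ghost-7b-alpha-gguf",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```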
Use Docker
```bash
docker model run hf.co/ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
- SGLang
How to use ghost-x/ghost-7b-alpha-gguf with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ghost-x/ghost-7b-alpha-gguf" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ghost-x/ghost-7b-alpha-gguf",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "ghost-x/ghost-7b-alpha-gguf" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ghost-x/ghost-7b-alpha-gguf",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Ollama
How to use ghost-x/ghost-7b-alpha-gguf with Ollama:
```bash
ollama run hf.co/ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
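Once the model has been pulled, Ollama also exposes a local REST API. A minimal sketch, assuming Ollama is running on its default port 11434 and using the model tag from the command above:

```python
# Sketch: call the local Ollama server's chat endpoint.
# Requires: pip install requests
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "hf.co/ghost-x/ghost-7b-alpha-gguf:Q4_K_M",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "stream": False,  # return one complete response instead of streamed chunks
    },
)
print(resp.json()["message"]["content"])
```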
- Unsloth Studio
How to use ghost-x/ghost-7b-alpha-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```bash
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ghost-x/ghost-7b-alpha-gguf to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ghost-x/ghost-7b-alpha-gguf to start chatting
```
Using HuggingFace Spaces for Unsloth
```
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ghost-x/ghost-7b-alpha-gguf to start chatting
```
- Docker Model Runner
How to use ghost-x/ghost-7b-alpha-gguf with Docker Model Runner:
```bash
docker model run hf.co/ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
- Lemonade
How to use ghost-x/ghost-7b-alpha-gguf with Lemonade:
Pull the model
```bash
# Download Lemonade from https://lemonade-server.ai/
lemonade pull ghost-x/ghost-7b-alpha-gguf:Q4_K_M
```
Run and chat with the model
```bash
lemonade run user.ghost-7b-alpha-gguf-Q4_K_M
```
List all available models
```bash
lemonade list
```
Ghost 7B Alpha
A generation of large language models focused on strong reasoning, multi-task knowledge, and tool support.
Introduction
Ghost 7B Alpha is a large language model fine-tuned from Mistral 7B, with 7 billion parameters. It was developed to optimize reasoning ability and multi-task knowledge, and to support tool usage. The model performs best in its main trained and optimized languages, English and Vietnamese.
Overall, the model is well suited as a base for continued fine-tuning on your own tasks, for building virtual assistants, and for tasks such as coding, translation, question answering, and document generation. It is an efficient, fast, and very inexpensive open model.
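Since English and Vietnamese are the primary languages, a minimal sketch of a Vietnamese chat turn, reusing the llama-cpp-python setup from the quick-start section above (the prompt is illustrative):

```python
# Sketch: ask the model a question in Vietnamese via llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ghost-x/ghost-7b-alpha-gguf",
    filename="ghost-7b-alpha-Q4_0.gguf",
)
# "What is the capital of Vietnam?" in Vietnamese
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Thủ đô của Việt Nam là gì?"}]
)
print(response["choices"][0]["message"]["content"])
```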
Specifications
- Name: Ghost 7B Alpha.
- Model size: 7 billion parameters.
- Context length: 8K (8,192 tokens).
- Languages: English and Vietnamese.
- Main tasks: reasoning, multi-task knowledge, and function/tool calling.
- License: Ghost 7B LICENSE AGREEMENT.
- Based on: Mistral 7B.
- Distributions: Standard (BF16), GGUF, AWQ.
- Developed by: Ghost X, Hieu Lam.
Links
- Model card: 🤗 HuggingFace.
- Official website: Ghost 7B Alpha.
- Demo: Playground with Ghost 7B Alpha.
Distributions
We provide several distributions so you can choose the option that best suits your needs. Make sure you know which version you need and which will work best for your setup.
| Version | Model card |
|---|---|
| BF16 | 🤗 HuggingFace |
| GGUF | 🤗 HuggingFace |
| AWQ | 🤗 HuggingFace |
Note
For all official information and updates about the model, see here:
- Model card: 🤗 HuggingFace.
- Official website: Ghost 7B Alpha.
- Demo: Playground with Ghost 7B Alpha.