Instructions to use DQN-Labs-Community/dqnGPT-v1-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with llama-cpp-python:
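# !pip install llama-cpp-python
from llama_cpp import Llama

# Download the GGUF file from the Hub (cached after the first call) and load it.
llm = Llama.from_pretrained(
    repo_id="DQN-Labs-Community/dqnGPT-v1-GGUF",
    filename="dqnGPT-v1.IQ4_XS.gguf",
)

# Send a single-turn chat request; the result is an OpenAI-style dict.
response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response["choices"][0]["message"]["content"])
- Notebooks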
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
Use Docker
docker model run hf.co/DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DQN-Labs-Community/dqnGPT-v1-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DQN-Labs-Community/dqnGPT-v1-GGUF",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
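The curl call above can also be made from Python using only the standard library. The following is a minimal sketch, assuming a server is already running at the address used in the curl example; the helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of vLLM or any other library.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Same address, model name, and prompt as the curl example.
request = build_chat_request(
    base_url="http://localhost:8000",
    model="DQN-Labs-Community/dqnGPT-v1-GGUF",
    prompt="What is the capital of France?",
)
# Uncomment once the server is running:
# print(send_chat_request(request))
```

Because llama-server also exposes an OpenAI-compatible API, the same helper should work against it by changing `base_url`; its default port is typically 8080 rather than 8000, so check the address your server prints at startup.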
- Ollama
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with Ollama:
ollama run hf.co/DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
- Unsloth Studio
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for DQN-Labs-Community/dqnGPT-v1-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for DQN-Labs-Community/dqnGPT-v1-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for DQN-Labs-Community/dqnGPT-v1-GGUF to start chatting
- Docker Model Runner
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with Docker Model Runner:
docker model run hf.co/DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
- Lemonade
How to use DQN-Labs-Community/dqnGPT-v1-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull DQN-Labs-Community/dqnGPT-v1-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.dqnGPT-v1-GGUF-Q4_K_M
List all available models
lemonade list
dqnGPT-v1
dqnGPT-v1 is a general-purpose, multimodal AI assistant designed to act as the central interface of the DQN Labs model ecosystem.
It combines strong reasoning, natural conversation, and expressive personality to deliver an engaging and capable AI experience, while working alongside specialized models such as dqnCode, dqnMath, and dqnScience.
Overview
dqnGPT-v1 is built on top of the Gemma 4 E4B architecture, a compact mixture-of-experts model that balances performance and efficiency.
It is designed not just to answer questions, but to:
- Communicate naturally
- Explain clearly
- React intelligently
- Delegate when appropriate
Unlike specialist models, dqnGPT-v1 acts as a controller and personality layer, making it ideal as a primary AI assistant.
Positioning
dqnGPT-v1 is not optimized for a single domain.
Instead, it is designed to:
- Handle a wide variety of everyday tasks
- Provide clear and engaging explanations
- Act as the front-facing AI in a modular system
- Route complex problems to specialized models when needed
It prioritizes usability, clarity, and personality over raw benchmark performance.
System Role
dqnGPT-v1 is part of a modular AI system:
- dqnCode: programming
- dqnMath: mathematics
- dqnScience: scientific reasoning
dqnGPT-v1:
- Attempts to solve problems independently
- Recognizes when deeper expertise is required
- Suggests specialized models only when appropriate
This creates a balanced and natural delegation system: while chatting, dqnGPT-v1 suggests one of the specialized models (dqnCode, dqnMath, or dqnScience) when doing so is likely to produce a better response.
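An application that wraps dqnGPT-v1 may want to notice when a reply recommends a specialist model. The sketch below is one simple way to do that, assuming the reply mentions the specialist by name; the card does not specify how suggestions are phrased, and the function `suggested_specialists` is hypothetical.

```python
import re

# Specialist model names as listed in this card.
SPECIALISTS = ("dqnCode", "dqnMath", "dqnScience")


def suggested_specialists(reply: str) -> list[str]:
    """Return the specialist model names mentioned in a reply, in card order."""
    found = []
    for name in SPECIALISTS:
        # Whole-word, case-insensitive match on the model name.
        if re.search(rf"\b{re.escape(name)}\b", reply, flags=re.IGNORECASE):
            found.append(name)
    return found


# Made-up reply text, used only to illustrate the check.
example_reply = "Here is a first pass. For a rigorous proof, dqnMath would be a better fit."
print(suggested_specialists(example_reply))
```

Name matching is deliberately crude: it only detects that a specialist was mentioned, not whether the model actually recommended switching.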
Personality & Interaction Style
dqnGPT-v1 is designed to feel like a conversational, human-like assistant.
Key traits:
- Natural and engaging tone
- Slightly playful but not excessive
- Reacts to interesting or complex ideas
- Adjusts energy based on context
- Avoids overly formal or robotic responses
Multimodal Capability
dqnGPT-v1 supports multimodal reasoning through its base architecture.
It can:
- Interpret images
- Explain visual content
- Assist with diagrams and structured inputs
Note: Multimodal performance depends on runtime support.
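For runtimes that expose an OpenAI-style chat API, an image is commonly passed as a base64 data URL inside the message content. The following sketch builds such a message; it assumes the runtime accepts the `image_url` content type, which, as noted above, depends on runtime support. The helper `build_image_message` is hypothetical.

```python
import base64


def build_image_message(image_bytes: bytes, question: str, mime_type: str = "image/png") -> dict:
    """Build an OpenAI-style user message containing a question and an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes, not a real image; in practice read them from a file:
#   image_bytes = open("diagram.png", "rb").read()
message = build_image_message(b"\x89PNG placeholder bytes", "What does this diagram show?")
print(message["content"][0]["text"])
```

The resulting dict can be placed in the `messages` list of a chat completion request in place of a plain-text user message.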
Model Description
- Base model: google/gemma-4-E4B-it
- Architecture: Mixture-of-Experts (MoE)
- Parameters: 8B total (4.5B active)
- Type: Causal Language Model
- Primary role: General assistant / controller
Intended Uses
Direct Use
- General AI assistant
- Learning and explanations
- Creative writing
- Brainstorming
- Everyday problem solving
System Integration
- Front-end assistant for multi-model systems
- Routing layer for specialized models
- Conversational interface for AI pipelines
Key Characteristics
- Balanced reasoning and personality
- Strong instruction following
- Natural conversation flow
- Context-aware delegation
- Consistent tone across responses
- Designed for real-world usability
Limitations
- Not optimized for highly specialized domains
- May defer advanced tasks to specialist models
- Multimodal performance depends on runtime support
- Not intended for large-scale enterprise workloads
Efficiency
dqnGPT-v1 is designed for efficient inference:
- Supports quantized formats (GGUF, 4-bit, etc.)
- Runs on consumer GPUs and local setups
- Optimized for responsiveness and usability
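As a rough guide to what quantization buys, the size of the weights can be estimated as parameters times bits per weight, divided by 8 bits per byte. The sketch below applies that arithmetic to the 8B total parameter count stated in this card. It is a back-of-the-envelope estimate only: real GGUF files mix precisions across tensors, and the figure excludes the KV cache and runtime overhead.

```python
def estimate_weight_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 8e9  # "8B total" from the Model Description section

for bits in (4, 8, 16):
    size = estimate_weight_size_gb(TOTAL_PARAMS, bits)
    print(f"{bits}-bit: about {size:.1f} GB of weights")
```

Under these assumptions, a 4-bit quantization needs roughly a quarter of the weight storage of the 16-bit version, which is what makes consumer GPUs and local setups practical.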
Usage
This repository provides:
- Custom chat template
- System prompt
- Behavior configuration
Training Details
dqnGPT-v1 is not fine-tuned on domain-specific datasets.
Instead, it uses:
- Prompt-based personality shaping
- Structured chat formatting
- System-level behavior design
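Since behavior is shaped through prompting rather than fine-tuning, the system prompt has to accompany each conversation. The sketch below shows one way to assemble an OpenAI-style message list with a system prompt in front. The prompt text here is a made-up placeholder, not the system prompt shipped in this repository, and `build_messages` is a hypothetical helper.

```python
from typing import Optional

# Placeholder only: substitute the system prompt provided with the repository.
HYPOTHETICAL_SYSTEM_PROMPT = (
    "You are a friendly general-purpose assistant. "
    "Suggest a specialist model only when it would clearly help."
)


def build_messages(
    user_text: str,
    system_prompt: str = HYPOTHETICAL_SYSTEM_PROMPT,
    history: Optional[list] = None,
) -> list:
    """Return a chat message list: system prompt, prior turns, then the new user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_text})
    return messages


messages = build_messages("Explain recursion in two sentences.")
print([m["role"] for m in messages])
```

The list can be passed as the `messages` argument of a chat completion call; whether the system role is honored depends on the chat template the runtime applies.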
License
Apache 2.0
Author
Developed by DQN Labs.
This model represents the central interface of the DQN ecosystem. This model card was generated with the help of dqnGPT v0.2.