You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

Nvidia.Agentic.Coder-4B-GGUF

📌 Model Overview

Model Name: WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF Organization: Within Us AI Model Type: Code LLM (Agentic, Instruction-Following) Parameter Size: 4B Format: GGUF (quantized for local inference) Primary Use: Agentic coding, tool-using workflows, software engineering reasoning

This model is part of the Within Us AI ecosystem focused on building agentic, reasoning-driven coding systems designed to think, act, and verify like real engineers. 

🧬 Architecture & Lineage

  • Base Family: NVIDIA Nemotron-style 4B class models (inferred lineage from naming + ecosystem alignment)
  • Format Conversion: GGUF quantization for efficient local inference
  • Training Approach:
    • Instruction-tuned for coding tasks
    • Agentic workflow emphasis (multi-step reasoning, tool usage)
    • Likely merged / fine-tuned using Within Us AI proprietary pipelines

Related ecosystem models include:

  • NVIDIA-Nemotron-3-Nano-4B
  • Other 4B agentic coders and merges in the same class 

⚙️ Key Capabilities

🧑‍💻 Code Intelligence

  • Multi-language code generation
  • Bug fixing and refactoring
  • Structured output generation

🤖 Agentic Behavior

  • Step-by-step reasoning
  • Task decomposition
  • Tool-calling alignment (design goal)

🧠 Reasoning Focus

  • Instruction-following with logical chaining
  • Designed for evaluation-style datasets (tests-as-truth philosophy)

📦 GGUF Quantization

GGUF allows efficient local inference with tools like:

  • llama.cpp
  • LM Studio
  • Ollama (GGUF-compatible builds)

Typical quantizations for 4B GGUF models include:

  • Q2_K (~1.8GB)
  • Q3_K (~2.0–2.3GB)
  • Q4_K (~2.5GB, recommended balance) 

🚀 Intended Use

✅ Ideal Use Cases

  • Local AI coding assistants
  • Autonomous coding agents
  • SWE-bench style evaluation
  • Tool-augmented workflows
  • Offline developer copilots

⚠️ Limitations

  • Smaller 4B parameter size limits deep reasoning vs larger models
  • Performance depends heavily on prompt structure
  • Tool-use requires external orchestration (not built-in runtime)

🛠️ Usage Example (llama.cpp)

./main -m Nvidia.Agentic.Coder-4B.Q4_K.gguf
-p "Write a Python function to parse JSON logs and extract errors."
-n 512

🧪 Training Philosophy (Within Us AI)

Within Us AI focuses on:

  • Agentic AI systems
  • Test-driven training (tests-as-truth)
  • Diff-first patching workflows
  • Secure and auditable code generation
  • Evaluation-first development pipelines 

📊 Evaluation

No formal benchmark results published yet.

Expected strengths:

  • Strong instruction adherence
  • Lightweight agentic reasoning
  • Efficient local deployment

📚 Datasets & Training Sources

This model follows the Within Us AI methodology:

  • Proprietary datasets created by Within Us AI
  • May include third-party datasets for training (no ownership claimed)
  • Emphasis on:
    • Code reasoning traces
    • Agentic workflows
    • Evaluation-driven samples

📜 License

License Type: Custom / Other (Within Us AI License)

Terms:

  • Within Us AI created the fine-tuning, merging, and training methodology
  • Base model architecture originates from third-party LLM ecosystems (e.g., NVIDIA / Nemotron class)
  • Third-party datasets may be used without claiming ownership
  • Full credit and acknowledgment belong to original dataset and base model creators

🙏 Acknowledgements

Special thanks to:

  • NVIDIA Nemotron ecosystem contributors
  • Open-source GGUF tooling community
  • Dataset creators across Hugging Face
  • The broader open-source AI research community

🔗 Links

Downloads last month
547
GGUF
Model size
4B params
Architecture
nemotron_h
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Datasets used to train WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF

Collection including WithinUsAI/Nvidia.Agentic.Coder-4B-GGUF