Instructions for using 16dvnk/AaI_mini.plus_exp_251111_Base with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use 16dvnk/AaI_mini.plus_exp_251111_Base with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="16dvnk/AaI_mini.plus_exp_251111_Base")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("16dvnk/AaI_mini.plus_exp_251111_Base", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use 16dvnk/AaI_mini.plus_exp_251111_Base with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "16dvnk/AaI_mini.plus_exp_251111_Base"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "16dvnk/AaI_mini.plus_exp_251111_Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/16dvnk/AaI_mini.plus_exp_251111_Base
```
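The curl call shown for the vLLM server can equally be issued from Python. A minimal sketch, assuming a server running locally on port 8000 as in the example above; the payload values mirror the curl command and the commented HTTP call is illustrative:

```python
import json
import urllib.request

def build_completion_request(model: str, prompt: str,
                             max_tokens: int = 512,
                             temperature: float = 0.5) -> dict:
    """Build an OpenAI-compatible /v1/completions payload."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = build_completion_request(
    "16dvnk/AaI_mini.plus_exp_251111_Base", "Once upon a time,"
)
print(json.dumps(payload))

# To actually call a locally running vLLM server (assumed at port 8000):
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```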
- SGLang
How to use 16dvnk/AaI_mini.plus_exp_251111_Base with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "16dvnk/AaI_mini.plus_exp_251111_Base" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "16dvnk/AaI_mini.plus_exp_251111_Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "16dvnk/AaI_mini.plus_exp_251111_Base" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "16dvnk/AaI_mini.plus_exp_251111_Base",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use 16dvnk/AaI_mini.plus_exp_251111_Base with Docker Model Runner:
```shell
docker model run hf.co/16dvnk/AaI_mini.plus_exp_251111_Base
```
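Whichever server you run (vLLM on port 8000 or SGLang on port 30000), the response follows the OpenAI completions schema. A sketch of extracting the generated text; the sample response below is illustrative, not real output from this model:

```python
def extract_completion_text(response: dict) -> str:
    """Return the text of the first choice in an OpenAI-style completions response."""
    choices = response.get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"]

# Illustrative response shape (not real model output):
sample = {
    "id": "cmpl-123",
    "object": "text_completion",
    "choices": [
        {"index": 0, "text": " there was a small language model.", "finish_reason": "length"}
    ],
}
print(extract_completion_text(sample))
```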
Safety Concerns
This model has not undergone any safety tuning. Use it at your own risk; we are not responsible for any damages.
AaI Introduction
AaI is a model built entirely from scratch by 16dvnk on his NVIDIA GeForce RTX 4080 Laptop GPU. He trained it for 11 hours straight and, after some tuning, produced this model. He describes the process as painful and effort-intensive. He named it AaI rather than AAI or another variation because he considers those an "eyesore".
Architecture
The model uses a generative pre-trained transformer (GPT) architecture.
Technical Specifications
| AaI Specs | Details |
|---|---|
| Creator | 16dvnk |
| Hardware | NVIDIA GeForce RTX 4080 Laptop GPU |
| Training Duration | 21 hours |
| Framework | PyTorch |
| Parameter Count | 14 million |
| Model Type | Generative pre-trained transformer |
| Initial Training Year | 2025 |
| Stable Release Status | No stable release as of December 2025 |
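As a rough sanity check on the 14M figure, a decoder-only transformer's parameter count can be estimated from its dimensions. The configuration values below are illustrative guesses, not the model's published config:

```python
def gpt_param_estimate(vocab_size: int, d_model: int, n_layers: int) -> int:
    """Rough decoder-only transformer parameter estimate.

    Per layer: attention projections (4 * d^2 for Q, K, V, output)
    plus a 4x-wide MLP (2 * 4 * d^2), i.e. ~12 * d^2, ignoring biases
    and layer norms. Token embeddings add vocab * d (assumed tied
    with the output head).
    """
    embeddings = vocab_size * d_model
    per_layer = 12 * d_model * d_model
    return embeddings + n_layers * per_layer

# Illustrative config (guessed, NOT the published AaI config):
est = gpt_param_estimate(vocab_size=32000, d_model=256, n_layers=6)
print(f"{est / 1e6:.1f}M parameters")  # lands in the low-teens-of-millions range
```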
Evaluation Results
The model was evaluated on the test splits of the ARC-Easy and AaI-sbench benchmarks.
| Dataset | Split | Metric | Value |
|---|---|---|---|
| ARC-Easy | test | Accuracy | 17.85% |
| AaI-sbench | test | Accuracy | 60.00% |
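Accuracy here is simply the fraction of items answered correctly. A minimal sketch of the arithmetic; the predictions below are made up for illustration, not real model outputs:

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction matches the label."""
    if len(predictions) != len(labels):
        raise ValueError("length mismatch")
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

# Made-up example: 3 of 4 answers correct -> 0.75
print(accuracy(["A", "C", "B", "D"], ["A", "C", "B", "A"]))  # 0.75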
Notes
• All current releases have 14M parameters, which is considered small.
• The model was trained using PyTorch.
• As of December 2025, there is no stable release of AaI.
Datasets used to train 16dvnk/AaI_mini.plus_exp_251111_Base
- stas/openwebtext-10k
- RaiBP/openwebtext2-first-30-chunks-lang-detect-raw-output