About the Model
This is a fine-tuned version of Qwen3-VL-8B-Instruct specialized for electronic schematic understanding.
The model is trained to read schematic images and extract exact component information as it appears in the diagram, rather than generating generic component categories. During fine-tuning, the model learns to map visual schematic elements to:
- Component identifiers and part numbers (e.g. ATMEGA328P-PU)
- Footprint and library names (e.g. 7.62MM-3P)
- Net and power labels (e.g. +5V, GND)
- Other visible schematic text and symbols
Unlike general vision-language models, this fine-tuned model is optimized for copying schematic labels exactly as drawn, which makes it suitable for downstream tasks such as BOM generation, schematic analysis, CAD migration, and hardware documentation.
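As a rough illustration of the BOM use case, the sketch below counts parts from a list of extracted labels. It assumes that repeated instances of a part carry a numeric suffix such as `_1`, as in the dataset target shown under Results, and that the caller supplies which labels are nets. The helper `bom_counts` and its `nets` parameter are hypothetical and not part of this model or its dataset tooling.

```python
import re
from collections import Counter

# Assumption: a trailing "_<number>" marks a repeated instance of the same part.
_INSTANCE_SUFFIX = re.compile(r"_\d+$")


def bom_counts(labels, nets=()):
    """Count labels per base name, skipping net labels supplied by the caller."""
    skip = set(nets)
    counts = Counter()
    for label in labels:
        if label in skip:
            continue
        counts[_INSTANCE_SUFFIX.sub("", label)] += 1
    return counts


labels = ["+5V", "7.62MM-3P", "7.62MM-3P_1", "ATMEGA328P-PU", "ATMEGA328P-PU_1", "GND"]
for part, qty in bom_counts(labels, nets={"+5V", "GND"}).items():
    print(f"{part}\t{qty}")
```

Only a numeric suffix is stripped, so labels such as SERVO_A are left intact. A net or part whose real name ends in `_<number>` would be merged with its base name, so the suffix rule should be checked against the actual schematics.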
The model operates in a causal generation setting, taking a schematic image and a short instruction prompt, and producing structured text outputs such as component lists, YAML/JSON metadata, or raw schematic text.
Usage
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

MODEL_ID = "kingabzpro/qwen3vl-open-schematics-lora"  # change me

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
).eval()


def build_prompt(example):
    name = example.get("name") or "Unknown project"
    ftype = example.get("type") or "unknown format"
    return (
        f"Project: {name}\nFormat: {ftype}\n"
        "From the schematic image, extract all component labels and identifiers exactly as shown "
        "(part numbers, values, footprints, net labels like +5V/GND).\n"
        "Output only a comma-separated list. Do not generalize or add extra text."
    )


def run_inference(model_, example, max_new_tokens=256):
    prompt = build_prompt(example)
    messages = [{
        "role": "user",
        "content": [
            {"type": "image", "image": example["image"]},
            {"type": "text", "text": prompt},
        ],
    }]
    inputs = processor.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model_.device)
    with torch.inference_mode():
        out = model_.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    gen = out[0][inputs["input_ids"].shape[1]:]
    return processor.decode(gen, skip_special_tokens=True)


# ---- Small usage example ----
example = {
    "name": "Arduino-like Board",
    "type": "kicad",
    "image": Image.open("schematic.png").convert("RGB"),
}
print(run_inference(model, example))
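Because the prompt above asks for a comma-separated list, the returned string can be turned into a Python list with a small parser. The following is a minimal sketch; `parse_labels` is a hypothetical helper added for illustration, and the sample reply is made up rather than real model output.

```python
def parse_labels(reply: str) -> list[str]:
    """Split a comma-separated reply into unique labels, keeping first-seen order."""
    seen = set()
    labels = []
    # Treat line breaks like commas, since long lists may wrap across lines.
    for raw in reply.replace("\n", ",").split(","):
        label = raw.strip()
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    return labels


# Illustrative reply string, not an actual generation.
reply = "ATMEGA328P-PU, +5V, GND, R, C,\nSERVO_A, SERVO_B, GND"
print(parse_labels(reply))
```

This splits on every comma, so a label that itself contains a comma would be broken apart; a stricter output format such as JSON would avoid that at the cost of a longer prompt.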
Results
This model is a fine-tuned version of Qwen3-VL-8B-Instruct trained specifically to understand electronics schematics and extract component information directly from schematic images.
Compared to the base model, the fine-tuned model is more focused on relevant schematic entities (components, nets, identifiers) instead of raw pin-level text.
Before (Base Qwen3-VL)
The base model reads a lot of schematic text, but mixes pin names, signals, and low-level labels:
R1,10kΩ,PC6_RESET#,PC6_ADC0,PC6_ADC1,...,PB7_XTAL2,VCC,AVCC,AREF,GND,+5V,
C1,22pF,C2,100nF,C3,22pF,X1,16MHz,U1,ATMEGA328P-PU,...
After (Fine-tuned)
After fine-tuning (1 epoch, ~800 samples), the model outputs a cleaner, more task-focused list:
ATMEGA328P-PU, +5V, GND, R, C, C16MHz,
SERVO_A, SERVO_B, SERVO_C, SERVO_D, SERVO_E, SERVO_F
Target (Dataset)
The training target focuses on component identifiers, footprints, and net labels:
+5V, 7.62MM-3P, 7.62MM-3P_1, ..., ATMEGA328P-PU, ATMEGA328P-PU_1,
GND, MBB02070C1002FCT00, ..., Y5P102K2KV16CC0224_2
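Even with a small dataset and a single training epoch, the fine-tuned model already filters out most pin-level text in favor of schematic-level labels. In the sample above it still emits generic tokens (R, C) and does not reproduce the footprints and part numbers found in the target (e.g. 7.62MM-3P, MBB02070C1002FCT00), so it is best treated as a starting point for further refinement with more data and stricter target alignment.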
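One way to put a number on how close an output is to its target is to compare the two as sets of labels. The sketch below computes set-based precision and recall; it is not the evaluation used for this model, `label_scores` is a hypothetical helper, and the two lists are toy data rather than real results.

```python
def label_scores(predicted, target):
    """Return (precision, recall) of predicted labels against target labels, as sets."""
    pred, gold = set(predicted), set(target)
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Toy lists for illustration only.
predicted = ["ATMEGA328P-PU", "+5V", "GND", "R"]
target = ["+5V", "7.62MM-3P", "ATMEGA328P-PU", "GND"]
precision, recall = label_scores(predicted, target)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact string matching is deliberately strict here, which fits the goal of copying labels as drawn; it ignores duplicates, so instance counts would need a separate check.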
Model tree for kingabzpro/qwen3vl-open-schematics-lora
Base model
Qwen/Qwen3-VL-8B-Instruct