The model is now added to WebAI.js. Test it easily in your browser with no code required.

#11

by AxolsWebAI - opened Nov 19, 2025

Hi there,

We've packaged this model in our open-source library WebAI.js, so you can integrate it into your project with just a few lines of code:

import { WebAI } from '@axols/webai-js';

const webai = await WebAI.create({
  modelId: "gemma-3-270m-it"
});

await webai.init({
  mode: "webai",
  precision: "q4",
  device: "webgpu"
});

const generation = await webai.generate({
  userInput: {
    messages: [
      {
        role: "user",
        content: "Generate a Python code snippet for the Bubble Sort algorithm, make the answer concise."
      }
    ]
  },
  modelConfig: {
    max_new_tokens: 600,
    temperature: 0.9
  },
  generateConfig: {
    skip_special_tokens: true
  }
});

// Or stream the output as it is generated
// (named differently so it does not redeclare `generation` above)
const streamGeneration = await webai.generateStream({
  userInput: {
    messages: [
      {
        role: "user",
        content: "Generate me a new blog post about the benefits of AI in healthcare."
      }
    ]
  },
  onStream: (chunk) => {
    // Edits here will not have any effect in the playground.
    console.log(chunk);
  }
});
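
For anyone wiring this into an app, here is a minimal sketch of two small helpers around the calls above. Both `buildGenerateRequest` and `createChunkCollector` are hypothetical names, not part of WebAI.js; the request shape is copied from the snippet in this post, and the collector assumes each streamed chunk is a string (or converts cleanly to one), which the post does not confirm.

```javascript
// Hypothetical helper (not part of WebAI.js): builds the argument object
// for webai.generate() using the field names shown in the post above.
function buildGenerateRequest(prompt, { maxNewTokens = 600, temperature = 0.9 } = {}) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new TypeError("prompt must be a non-empty string");
  }
  return {
    userInput: {
      messages: [{ role: "user", content: prompt }],
    },
    modelConfig: { max_new_tokens: maxNewTokens, temperature },
    generateConfig: { skip_special_tokens: true },
  };
}

// Hypothetical helper (not part of WebAI.js): accumulates streamed chunks.
// Assumption: each chunk handed to onStream is a string or string-convertible.
function createChunkCollector() {
  const chunks = [];
  return {
    onStream: (chunk) => {
      chunks.push(String(chunk));
    },
    text: () => chunks.join(""),
  };
}
```

Under those assumptions, the result of `buildGenerateRequest("...")` could be passed to `webai.generate(...)`, and `collector.onStream` could be supplied as the `onStream` callback of `webai.generateStream(...)`, with `collector.text()` read once streaming finishes.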

🔬 Try Gemma 3 270M Instruct Instantly (No Code Required)

You can benchmark and test the model directly here:
https://www.webai-js.com/models/gemma-3-270m-it/playground

📘 Full API Reference

Detailed parameter explanations can be found here:
https://www.webai-js.com/models/gemma-3-270m-it/api-reference/v1/class-api/methods/webai-generate

🧡 Fully Open Source

WebAI.js is completely open source and free to use.
Star or contribute on GitHub:
https://github.com/axolsai/webai-js

Thanks

BalakrishnaCh

Google org Nov 20, 2025

Hi @AxolsWebAI,

I really appreciate you showcasing the integration of the Gemma 3 270M Instruct model into your WebAI.js library. It looks like a very straightforward and powerful way for developers to use it in their projects. Thanks for making it open source.

Thanks.
