Can't run Gemma4 locally
I'm trying to run the Gemma4 E2B model locally through the Hugging Face transformers module:
from transformers import AutoTokenizer, AutoModelForCausalLM
local_path = /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForCausalLM.from_pretrained(local_path)
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))
but I keep getting this error message:
ModuleNotFoundError: Could not import module 'Gemma4Config'. Are this object's requirements defined correctly?
I've already tried upgrading transformers to version 5.7.0, but the error is still the same.
Has anyone faced a similar issue, and how did you resolve it?
Hi @polymathLTE ,
Thanks for raising the issue. Looking at your code, could you please wrap local_path in quotes (local_path = "/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B") and let us know if the issue still persists?
Thanks for your response. My local_path is wrapped in quotes, so sorry I omitted that in my opening post. And yes, the issue persists.
Hi @polymathLTE ,
Thanks for the clarification. Since you are loading from "local_path", we need to check a few things first. Could you please share the output of these checks?
- Verify Local Config & Model Type:
Run these in your terminal to ensure the file exists and is mapped to the correct architecture:
ls -la /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B/config.json
grep "model_type" /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B/config.json
Additionally, could you please provide a screenshot of the files stored in your local directory? This will help us confirm that all the necessary model files are present.
Once you have confirmed these points, we can look into the next steps.
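For anyone who prefers to run the same checks from Python rather than the terminal, here is a minimal sketch. The helper name inspect_checkpoint is made up for illustration; it only reads config.json and lists the directory, and does not import transformers.

```python
import json
from pathlib import Path


def inspect_checkpoint(model_dir):
    """Summarise what a local checkpoint directory declares about itself.

    Hypothetical helper: mirrors the `ls` and `grep "model_type"` checks above.
    """
    root = Path(model_dir)
    config_path = root / "config.json"
    summary = {
        "config_exists": config_path.is_file(),
        "model_type": None,
        "architectures": None,
        # File names only, so the listing can be pasted instead of a screenshot.
        "files": sorted(p.name for p in root.iterdir()) if root.is_dir() else [],
    }
    if summary["config_exists"]:
        with open(config_path, "r", encoding="utf-8") as f:
            config = json.load(f)
        summary["model_type"] = config.get("model_type")
        summary["architectures"] = config.get("architectures")
    return summary


# Usage (path taken from this thread; adjust to your own checkout):
# print(inspect_checkpoint("/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B"))
```

Pasting the printed dictionary into the thread gives the same information as the two shell commands plus the file listing.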
I cloned the repo from Huggingface with git clone https://huggingface.co/google/gemma-4-E2B
Good Sir/Maam or M/F/D polymathLTE:
I tried this too, but unfortunately it took 8+ hours of work.
Tutorial here: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/ . The task was not as easy as you might think.

I am Right to be Honest. Papaev Burin-Zhargal .
P.S. Result was pretty good
Fixed the issue. Here is a breakdown.
First, I had to clean up all the dependencies:
!pip uninstall -y transformers torch torchvision accelerate pillow
I then reinstalled PyTorch and torchvision first, since the other libraries depend on them. I'm using CPU only for inference on this server, so I point the install at the CPU wheel index:
!pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
Then I reinstalled the libraries that sit on top of PyTorch:
!pip install transformers accelerate pillow
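After a clean reinstall it can help to confirm which versions actually ended up in the environment. This is a small sketch using only the standard library; the helper name installed_versions is hypothetical, and the package list is simply the five packages from the commands above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The five packages uninstalled and reinstalled above.
for name, version in installed_versions(
    ["torch", "torchvision", "transformers", "accelerate", "pillow"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

In a notebook, restart the kernel before running this so that no previously imported copy of a package is still loaded.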
The biggest change is with the Auto classes. Gemma 4 models (especially the E2B and E4B variants) are multimodal and need to be loaded with either AutoModelForImageTextToText or AutoModelForMultimodalLM instead of AutoModelForCausalLM. I used AutoModelForMultimodalLM in this case.
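Since not every installed transformers version necessarily exposes both class names, one option is to try them in order and take the first that imports. The helper below is a hypothetical sketch, not part of the library; it works on any module object and treats a failed import the same way as a missing name.

```python
def first_available(module, names):
    """Return (name, attribute) for the first name that `module` can provide."""
    for name in names:
        try:
            return name, getattr(module, name)
        except (AttributeError, ImportError):
            # AttributeError: the name does not exist in this version.
            # ImportError: the name exists but importing it failed.
            continue
    label = getattr(module, "__name__", repr(module))
    raise ImportError(f"none of {list(names)} could be imported from {label}")


# Usage with the two class names mentioned above:
# import transformers
# name, model_cls = first_available(
#     transformers, ["AutoModelForMultimodalLM", "AutoModelForImageTextToText"]
# )
# model = model_cls.from_pretrained(local_path)
```

The order of the list encodes the preference, so swap the two names if you would rather use AutoModelForImageTextToText when both are present.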
Full code:
import os
import torch
from transformers import AutoProcessor, AutoModelForMultimodalLM

# 1. Configuration
local_path = "/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B"

# 2. Load Processor and Model
processor = AutoProcessor.from_pretrained(local_path)
model = AutoModelForMultimodalLM.from_pretrained(
    local_path,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)

# 3. Using chat_template gave me some hassle, but this was my solution
jinja_file = os.path.join(local_path, "chat_template.jinja")
if os.path.exists(jinja_file):
    with open(jinja_file, "r") as f:
        processor.chat_template = f.read()
else:
    print(f"Warning: {jinja_file} not found. Fallback to manual.")
    processor.chat_template = (
        "{% for message in messages %}"
        "<|turn|>{{ message['role'] }}\n{{ message['content'] }}<|turn|>"
        "{% endfor %}{% if add_generation_prompt %}<|turn|>assistant\n{% endif %}"
    )

# 4. Evaluation Prompt
messages = [
    {
        "role": "user",
        "content": "I have 3 shirts in the dryer. It takes 45 minutes to dry them all together. "
                   "How long will it take to dry 30 shirts if they all fit in the same dryer at once? "
                   "Explain your reasoning step-by-step."
    }
]

# 5. Process and Generate
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7
)

# 6. Output
print(processor.decode(outputs[0], skip_special_tokens=True))
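The chat template step in the code above can also be pulled out into a small function, which makes it easier to test without loading the model. This is only a sketch: read_chat_template is a hypothetical helper, and the chat_template.jinja file name is the one used in the code above.

```python
from pathlib import Path


def read_chat_template(model_dir, fallback=None, filename="chat_template.jinja"):
    """Return the template text stored in `model_dir`, or `fallback` if the file is missing."""
    path = Path(model_dir) / filename
    if path.is_file():
        # Explicit encoding so the result does not depend on the platform default.
        return path.read_text(encoding="utf-8")
    return fallback


# Usage with the processor from the code above:
# template = read_chat_template(local_path)
# if template is not None:
#     processor.chat_template = template
```

Returning None when the file is absent leaves it to the caller to decide whether to supply a manual template or keep whatever the processor already loaded.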
P.S.
Why AutoModelForCausalLM fails (as far as I can tell): the error mentions Gemma4Config because the library maps the model_type in the local config.json to a config class and then to a model class. Because Gemma 4 E2B ships with audio and vision encoders, the standard causal (text-only) loader cannot properly initialize the required multimodal sub-modules, so a multimodal Auto class is needed.
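If you hit the same Gemma4Config message and want to see what is behind it, printing the chain of exceptions can help. The sketch below is generic Python and the helper name exception_chain is made up; whether the library attaches the original failure to the error it raises is an assumption on my part, so the chain may contain only one entry.

```python
def exception_chain(action):
    """Call `action()`; return [] on success, else one line per chained exception."""
    try:
        action()
    except Exception as exc:
        lines = []
        seen = set()
        current = exc
        # Follow explicit causes first, then implicit context, guarding against cycles.
        while current is not None and id(current) not in seen:
            seen.add(id(current))
            lines.append(f"{type(current).__name__}: {current}")
            current = current.__cause__ or current.__context__
        return lines
    return []


# Usage (Gemma4Config is the name from the error message in this thread):
# import transformers
# for line in exception_chain(lambda: transformers.Gemma4Config):
#     print(line)
```

The last line printed is the earliest failure in the chain, which is usually the most useful one to search for.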
References: google ai mode, gemini 🙌
Thanks @thnamratha and @Burin-Zhargal


