Can't run Gemma4 locally

by polymathLTE - opened 20 days ago

I'm trying to run the Gemma4 E2B model locally through the Hugging Face transformers library:

from transformers import AutoTokenizer, AutoModelForCausalLM
local_path = /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B

tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForCausalLM.from_pretrained(local_path)

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))

but I keep getting this error message:

ModuleNotFoundError: Could not import module 'Gemma4Config'. Are this object's requirements defined correctly?

I've already tried upgrading transformers to version 5.7.0, but the error is the same.
Has anyone faced a similar issue, and if so, how did you resolve it?

thnamratha

Google org 19 days ago

Hi @polymathLTE ,

Thanks for raising the issue. Looking at your code, could you please wrap local_path in quotes (local_path = "/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B") and let us know if the issue still persists?

polymathLTE

18 days ago

Thanks for your response. My local_path is wrapped in quotes, so sorry I omitted that in my opening post. And yes, the issue persists.

thnamratha

Google org 17 days ago

Hi @polymathLTE ,

Thanks for the clarification. Since you are loading from local_path, we need to check a few things first. Could you please share the output of these checks?

Verify the local config and model type:
Run these in your terminal to confirm that the config file exists and declares the expected model type:
ls -la /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B/config.json
grep "model_type" /teamspace/studios/this_studio/gemma4_test/gemma-4-E2B/config.json
Additionally, could you please provide a screenshot of the files stored in your local directory? This will help us confirm that all the necessary model files are present.
Once you have shared these, we can look into the next steps.
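For anyone who prefers to run the same checks from Python, here is a minimal sketch. The helper name inspect_model_dir is hypothetical (it is not part of transformers); it only reads config.json and lists the directory, assuming the usual Hugging Face checkpoint layout.

```python
import json
from pathlib import Path


def inspect_model_dir(model_dir):
    """Report model_type/architectures from config.json plus the files in model_dir."""
    root = Path(model_dir)
    config_file = root / "config.json"
    if not config_file.is_file():
        # Without config.json the Auto classes cannot pick an architecture.
        return {"config_found": False, "model_type": None,
                "architectures": [], "files": []}
    with config_file.open("r", encoding="utf-8") as f:
        config = json.load(f)
    return {
        "config_found": True,
        "model_type": config.get("model_type"),
        "architectures": config.get("architectures", []),
        "files": sorted(p.name for p in root.iterdir()),
    }


if __name__ == "__main__":
    # Path taken from the thread; adjust it to your own checkout.
    print(inspect_model_dir("/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B"))
```

If config_found is False or model_type is None, the problem is probably with the downloaded files rather than with the installed libraries.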

polymathLTE

17 days ago

The output of the ls and grep commands: [screenshot not preserved]

Screenshot of my local directory: [screenshot not preserved]

polymathLTE

17 days ago

I cloned the repo from Huggingface with git clone https://huggingface.co/google/gemma-4-E2B

Burin-Zhargal

15 days ago • edited 15 days ago

Dear polymathLTE,

I tried this too, but unfortunately it took me 8+ hours of work.
The tutorial I followed is here: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/ . The task was not as much fun as you might think.

To be honest with you, Papaev Burin-Zhargal.
P.S. The result was pretty good.

polymathLTE

13 days ago • edited 13 days ago

Fixed the issue. Here is a breakdown.
First, I had to clean out all the existing dependencies:

!pip uninstall -y transformers torch torchvision accelerate pillow

I then reinstalled PyTorch and torchvision first, since they are the base libraries that the others rely on. I'm using CPU only for inference on this server, so I point pip at the CPU wheel index:

!pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

Then I reinstalled the wrapper libraries:

!pip install transformers accelerate pillow
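After a reinstall like this, it can help to confirm what actually ended up in the environment. The following is a minimal sketch using only the standard library; the helper name installed_versions is hypothetical, and the package list simply mirrors the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    wanted = ["transformers", "torch", "torchvision", "accelerate", "pillow"]
    for name, ver in installed_versions(wanted).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

In a notebook, restart the kernel before re-running the model code, since modules that were already imported stay loaded in the running session.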

The biggest change is with the Auto Classes. Gemma 4 models (especially the E2B and E4B variants) are multimodal and require either AutoModelForImageTextToText or AutoModelForMultimodalLM instead of AutoModelForCausalLM. I used AutoModelForMultimodalLM in this case.
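To see which of these Auto classes your installed transformers version exposes before picking one, a small check like the following may help. This is only a sketch: the helper available_names is hypothetical, and it checks that a name can be resolved, not that the class can load this particular checkpoint.

```python
import importlib


def available_names(module_name, candidates):
    """Return the candidate attribute names that the module can resolve, in order."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return []  # the module itself is not installed
    present = []
    for name in candidates:
        try:
            getattr(module, name)
        except (AttributeError, ImportError):
            # Lazily loaded packages can raise ImportError here instead of
            # AttributeError when a backend dependency is missing.
            continue
        present.append(name)
    return present


if __name__ == "__main__":
    print(available_names(
        "transformers",
        ["AutoModelForMultimodalLM", "AutoModelForImageTextToText", "AutoModelForCausalLM"],
    ))
```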

Full code:

import os
import torch
from transformers import AutoProcessor, AutoModelForMultimodalLM

# 1. Configuration
local_path = "/teamspace/studios/this_studio/gemma4_test/gemma-4-E2B"

# 2. Load Processor and Model
processor = AutoProcessor.from_pretrained(local_path)
model = AutoModelForMultimodalLM.from_pretrained(
    local_path,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)

# 3. Chat template: the processor's template gave me some hassle, so I load the file directly
jinja_file = os.path.join(local_path, "chat_template.jinja")
if os.path.exists(jinja_file):
    with open(jinja_file, "r") as f:
        processor.chat_template = f.read()
else:
    print(f"Warning: {jinja_file} not found. Fallback to manual.")
    processor.chat_template = (
        "{% for message in messages %}"
        "<|turn|>{{ message['role'] }}\n{{ message['content'] }}<|turn|>"
        "{% endfor %}{% if add_generation_prompt %}<|turn|>assistant\n{% endif %}"
    )

# 4.  Evaluation Prompt
messages = [
    {
        "role": "user", 
        "content": "I have 3 shirts in the dryer. It takes 45 minutes to dry them all together. "
                   "How long will it take to dry 30 shirts if they all fit in the same dryer at once? "
                   "Explain your reasoning step-by-step."
    }
]

# 5. Process and Generate
prompt = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs, 
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7
)

# 6. Output
print(processor.decode(outputs[0], skip_special_tokens=True))
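A side note on the output step: with decoder-only models, generate() normally returns the prompt tokens followed by the new ones, so the print above will usually echo the prompt as well. A minimal sketch of keeping only the generated part is shown below; new_token_ids is a hypothetical helper, and it assumes a single 1-D sequence such as outputs[0].

```python
def new_token_ids(output_ids, prompt_length):
    """Drop the first prompt_length ids, keeping only the generated ones."""
    return output_ids[prompt_length:]


# Intended use with the script above (names come from that script):
#   prompt_length = inputs["input_ids"].shape[-1]
#   reply = new_token_ids(outputs[0], prompt_length)
#   print(processor.decode(reply, skip_special_tokens=True))
```

Slicing works the same way on a plain list and on a 1-D tensor, which is why the helper does not import torch.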

P.S.
Why AutoModelForCausalLM fails (as far as I understand it): the error mentions Gemma4Config because the library tries to map the model_type in the local config.json to a known text-only class. Because Gemma 4 E2B includes audio and vision encoders natively, the standard causal (text-only) loader cannot properly initialize the required multimodal sub-modules.

References: Google AI Mode, Gemini 🙌
Thanks @thnamratha and @Burin-Zhargal
