I would like a recommended training environment setup for the Qwen3.5-MoE model (e.g., Qwen3.5-35B-A3B, model_type: qwen3_5_moe).
Please provide a complete and reproducible environment specification, including:
OS / Driver / CUDA requirements (e.g., Ubuntu version, NVIDIA driver, CUDA toolkit, NCCL).
A working Python version and the exact PyTorch version (and torchvision/torchaudio if needed).
The exact compatible versions (or installation method) for Transformers and PEFT, such that Transformers correctly recognizes qwen3_5_moe via AutoConfig.from_pretrained(..., trust_remote_code=True).
Any additional dependencies required for training (e.g., flash-attn, deepspeed, megatron-core) and recommended versions.
Clear guidance on avoiding common dependency conflicts (e.g., between vLLM and Transformers, or numpy/pydantic version constraints).
A minimal sanity check script/commands to verify the environment can load the model config and start training successfully.
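For context, something along the lines of the following sketch is what I mean by a sanity check. It is only an illustration under assumptions: the model id "Qwen/Qwen3.5-35B-A3B" and the package names (`torch`, `transformers`, `peft`) are taken from the question above, and `check_model_config` needs network access (or a local cache) to actually resolve the config.

```python
# Minimal environment sanity-check sketch (assumptions: model id
# "Qwen/Qwen3.5-35B-A3B" and package names below; adjust to your setup).
import importlib.util
import sys


def check_env(min_python=(3, 10), required=("torch", "transformers", "peft")):
    """Report whether the interpreter version and key training packages are available."""
    report = {"python": sys.version_info[:2] >= min_python}
    for pkg in required:
        # find_spec returns None when the package is not importable
        report[pkg] = importlib.util.find_spec(pkg) is not None
    return report


def check_model_config(model_id="Qwen/Qwen3.5-35B-A3B"):
    """Confirm Transformers can resolve the model_type (requires network or a local cache)."""
    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
    return cfg.model_type


if __name__ == "__main__":
    print(check_env())
    # Uncomment once the packages above are installed and the model is reachable;
    # the expectation from the question is model_type == "qwen3_5_moe":
    # print(check_model_config())
```

A concrete pass/fail criterion like this (plus a one-step training smoke test) is what would make the environment spec verifiable.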