I would like a recommended training environment setup for the Qwen3.5-MoE model (e.g., Qwen3.5-35B-A3B, model_type: qwen3_5_moe).
Please provide a complete and reproducible environment specification, including:
OS / Driver / CUDA requirements (e.g., Ubuntu version, NVIDIA driver, CUDA toolkit, NCCL).
A working Python version and the exact PyTorch version (and torchvision/torchaudio if needed).
The exact compatible versions (or installation method) for Transformers and PEFT, such that Transformers correctly recognizes qwen3_5_moe via AutoConfig.from_pretrained(..., trust_remote_code=True).
Any additional dependencies required for training (e.g., flash-attn, deepspeed, megatron-core) and recommended versions.
Clear guidance on avoiding common dependency conflicts (e.g., between vLLM and Transformers, or numpy/pydantic version constraints).
A minimal sanity check script/commands to verify the environment can load the model config and start training successfully.
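For context, something along the lines of the following sketch is what I mean by a sanity check. It is only an illustration under assumptions: the model id "Qwen/Qwen3.5-35B-A3B" and the package names (`torch`, `transformers`, `peft`) are taken from the question above, and `check_model_config` needs network access (or a local cache) to actually resolve the config.

```python
# Minimal environment sanity-check sketch (assumptions: model id
# "Qwen/Qwen3.5-35B-A3B" and package names below; adjust to your setup).
import importlib.util
import sys


def check_env(min_python=(3, 10), required=("torch", "transformers", "peft")):
    """Report whether the interpreter version and key training packages are available."""
    report = {"python": sys.version_info[:2] >= min_python}
    for pkg in required:
        # find_spec returns None when the package is not importable
        report[pkg] = importlib.util.find_spec(pkg) is not None
    return report


def check_model_config(model_id="Qwen/Qwen3.5-35B-A3B"):
    """Confirm Transformers can resolve the model_type (requires network or a local cache)."""
    from transformers import AutoConfig

    cfg = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
    return cfg.model_type


if __name__ == "__main__":
    print(check_env())
    # Uncomment once the packages above are installed and the model is reachable;
    # the expectation from the question is model_type == "qwen3_5_moe":
    # print(check_model_config())
```

A concrete pass/fail criterion like this (plus a one-step training smoke test) is what would make the environment spec verifiable.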