Qwen3_32B
`from modal_training_gym.common.models.qwen3_32b import Qwen3_32B`
Qwen3-32B (32 billion parameters) from Alibaba.
Inherits from: HFModelConfiguration, ModelConfiguration
Fields
| Field | Type | Default | Description |
|---|---|---|---|
| model_name | `str` | `"Qwen/Qwen3-32B"` | HuggingFace repo ID or other model identifier. |
| model_path | `str \| None` | `None` | |
| architecture | `ModelArchitecture \| None` | `None` | |
| training | `ModelTrainingConfig \| None` | `ModelTrainingConfig(gpu_type='H100', n_nodes=4, tensor_model_parallel_size=1, pipeline_model_parallel_size=1, context_parallel_size=1, sequence_parallel=False, expert_model_parallel_size=1, moe_permute_fusion=False, moe_grouped_gemm=False, moe_shared_expert_overlap=False, moe_aux_loss_coeff=0.0, lora_rank=8, lora_alpha=16, target_modules='all-linear', merge_lora=False)` | |
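To make the default `training` values easier to read, here is a minimal sketch of two quantities they imply. `TrainingDefaults` is a hypothetical stand-in that copies a few of the documented defaults; it is not the library's `ModelTrainingConfig`, whose constructor is not shown on this page. The `alpha / rank` scaling is the conventional LoRA formula and is assumed, not confirmed, for this library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainingDefaults:
    """Hypothetical stand-in for illustration; NOT the library's ModelTrainingConfig."""

    gpu_type: str = "H100"
    n_nodes: int = 4
    tensor_model_parallel_size: int = 1
    pipeline_model_parallel_size: int = 1
    context_parallel_size: int = 1
    lora_rank: int = 8
    lora_alpha: int = 16


def model_parallel_size(cfg: TrainingDefaults) -> int:
    # GPUs that cooperate on one model replica: tensor x pipeline x context.
    return (
        cfg.tensor_model_parallel_size
        * cfg.pipeline_model_parallel_size
        * cfg.context_parallel_size
    )


def lora_scale(cfg: TrainingDefaults) -> float:
    # Conventional LoRA scaling factor alpha / rank (assumed for this library).
    return cfg.lora_alpha / cfg.lora_rank


defaults = TrainingDefaults()
print(model_parallel_size(defaults))  # 1
print(lora_scale(defaults))  # 2.0

# Overriding a field produces a new config; the defaults stay untouched.
sharded = replace(
    defaults, tensor_model_parallel_size=4, pipeline_model_parallel_size=2
)
print(model_parallel_size(sharded))  # 8
```

With every parallelism size left at 1, the defaults describe no tensor, pipeline, or context sharding, and the LoRA settings give a scale of 2.0 under the conventional formula.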
Methods
download_model(self) -> 'None'
Download or materialize weights into the model volume.
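A call to this method might look like the sketch below. The import path and the `download_model` signature come from this page. Constructing `Qwen3_32B()` with no arguments is an assumption based on every field having a default, and the helper name `download_qwen3_32b` is hypothetical. The call may also need to run in an environment where the model volume is available, which this page does not describe.

```python
def download_qwen3_32b():
    """Hypothetical helper: build the default configuration and fetch its weights."""
    # Imported lazily so this module loads even where the package is not installed.
    from modal_training_gym.common.models.qwen3_32b import Qwen3_32B

    model = Qwen3_32B()  # assumption: all fields fall back to their documented defaults
    model.download_model()  # documented to return None
    return model
```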