MilesFrameworkConfig

API reference for MilesFrameworkConfig

from modal_training_gym.frameworks.miles.config import MilesFrameworkConfig

Miles RLVR configuration, including Modal infrastructure.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| miles_image | str | "radixark/miles:dev-202603231227" | Docker image with patched Megatron-LM + Miles trainer. |
| image_run_commands | list[str] | [] | Extra commands appended to the image build. Default []. |
| n_nodes | int | 1 | Number of cluster nodes. Default 1. |
| app_tags | dict | {} | Extra Modal app tags. Default {}. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| recipe_args | str | "" | Raw Miles CLI flag block. Model architecture flags and per-run overrides go here. Values override typed defaults. Default "". |
| extra_args | str | "" | Extra flags appended after recipe_args. Default "". |
| custom_config_yaml | str | "" | Inline YAML overrides passed via --custom-config-path. Default "". |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| colocate | bool | True | Reuse a single compute pool for actor + rollout. Default True. |
| actor_nodes | `int \| None` | None | |
| rollout_num_gpus | `int \| None` | None | |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| advantage_estimator | str | "grpo" | Advantage estimation method. Default "grpo". |
| eps_clip | float | 0.2 | PPO clipping epsilon. Default 0.2. |
| clip_grad | float | 1.0 | Gradient clipping norm. Default 1.0. |
| kl_coef | float | 0.0 | KL divergence penalty coefficient. Default 0.0. |
| normalize_advantages | bool | False | Normalize advantages. Default False. |
| seed | int | 1234 | Random seed. Default 1234. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| lr | float | 1e-06 | Learning rate. Default 1e-6. |
| lr_decay_style | str | "constant" | LR decay schedule. Default "constant". |
| weight_decay | float | 0.0 | Weight decay. Default 0.0. |
| adam_beta1 | float | 0.9 | Adam beta1. Default 0.9. |
| adam_beta2 | float | 0.95 | Adam beta2. Default 0.95. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| micro_batch_size | int | 1 | Micro batch size per GPU. Default 1. |
| n_samples_per_prompt | int | 8 | Rollout samples per prompt. Default 8. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| bf16 | bool | True | Enable BF16 training. Default True. |
| attention_softmax_in_fp32 | bool | True | Compute attention softmax in FP32. Default True. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| attention_dropout | float | 0.0 | Attention dropout rate. Default 0.0. |
| hidden_dropout | float | 0.0 | Hidden layer dropout rate. Default 0.0. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| rollout_temperature | float | 1.0 | Sampling temperature for rollouts. Default 1.0. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| no_save_optim | bool | True | Skip saving optimizer state. Default True. |

Convert typed training defaults to Miles CLI flags.
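As a rough illustration of what such a conversion might look like, the sketch below turns a handful of the typed fields into a flat flag list. It is not the library's implementation: the `TypedDefaults` stand-in class, the `to_cli_flags` helper, the kebab-case flag naming, and the treatment of booleans as bare switches are all assumptions made for this example.

```python
from dataclasses import dataclass, fields


@dataclass
class TypedDefaults:
    """Hypothetical stand-in holding a few of the typed fields documented on this page."""
    advantage_estimator: str = "grpo"
    eps_clip: float = 0.2
    lr: float = 1e-6
    n_samples_per_prompt: int = 8
    bf16: bool = True
    normalize_advantages: bool = False


def to_cli_flags(cfg) -> list[str]:
    """Turn dataclass fields into argv tokens (assumed convention, not the real one)."""
    argv: list[str] = []
    for f in fields(cfg):
        value = getattr(cfg, f.name)
        # Assumption: field names map to kebab-case flags, e.g. eps_clip -> --eps-clip.
        flag = "--" + f.name.replace("_", "-")
        if isinstance(value, bool):
            # Assumption: booleans are bare switches, emitted only when True.
            if value:
                argv.append(flag)
        else:
            argv.extend([flag, str(value)])
    return argv


flags = to_cli_flags(TypedDefaults())
print(flags)
```

Under these assumptions, `normalize_advantages` (False) produces no flag at all, while `bf16` (True) produces a bare `--bf16`.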

Shlex-parse recipe_args + extra_args into a flat argv list.
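The parsing step can be pictured with Python's standard `shlex.split`, which tokenizes a string the way a POSIX shell would. The sketch below is a minimal illustration under assumptions: the flag names and values are placeholders chosen for the example (in particular `--run-name` is hypothetical), and simple concatenation of the two parsed lists is assumed from the description of `extra_args` as "appended after recipe_args".

```python
import shlex

# Illustrative flag block; newlines and indentation are treated as plain whitespace.
recipe_args = """
    --num-layers 28
    --hidden-size 1024
    --run-name "smoke test"
"""
extra_args = "--lr 5e-6"

# Quoted values stay together as a single token; everything else splits on whitespace.
argv = shlex.split(recipe_args) + shlex.split(extra_args)
print(argv)
```

Because the quoted value survives as one token, `"smoke test"` appears in `argv` as a single element rather than two.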

resolved_rollout_num_gpus(self) -> 'int | None'

Source: modal_training_gym/frameworks/miles/config.py