API Reference
Complete reference for the training-gym Python library.
| Class | Description |
|---|---|
| ModelConfiguration | Base class for model identity and weight-download logic. |
| HFModelConfiguration | ModelConfiguration for models hosted on HuggingFace. |
| ModelArchitecture | Transformer architecture parameters for a specific model. |
| DatasetConfig | Dataset configuration shared across training frameworks. |
| WandbConfig | Weights & Biases logging configuration shared across all frameworks. |
| ModalRayCluster | Base class for bootstrapping a Ray cluster inside Modal clustered functions. |
| LlmJudge | LLM-as-judge client for an OpenAI-compatible chat-completions endpoint. |
| TrainResult | Checkpoint handle for one completed training run. |
Models
| Class | Description |
|---|---|
| Qwen3-4B | Qwen3-4B (4 billion parameters) from Alibaba. |
| Qwen3-32B | Qwen3-32B (32 billion parameters) from Alibaba. |
| GLM-4.7 | GLM-4.7 large MoE model from Zhipu AI. |
| Llama2-7B | Llama 2 7B from Meta. |
| Kimi-K2.5 | Kimi K2.5 from Moonshot AI. |
Frameworks
| Class | Description |
|---|---|
| SlimeConfig | slime GRPO training configuration. |
| ModalConfig (slime) | Modal infrastructure configuration for slime, covering image setup and dev overlays. |
| MsSwiftFrameworkConfig | ms-swift Megatron SFT configuration, including Modal infrastructure. |
| MsSwiftConfig | Top-level wrapper that composes an ms-swift Megatron SFT run. |
| MilesFrameworkConfig | Miles RLVR configuration, including Modal infrastructure. |
| MilesConfig | Top-level wrapper that composes a Miles RLVR training run. |
| HarborFrameworkConfig | Harbor + Miles configuration for sandbox-based RL training. |
| HarborConfig | Top-level wrapper that composes a Harbor + Miles RLVR training run. |