
API Reference

Complete reference for the public classes of the training-gym Python library.

Core

Class	Description
`ModelConfig`	Base class for model identity and weight-download logic.
`HFModelConfiguration`	ModelConfig for models hosted on HuggingFace.
`ModelArchitecture`	Transformer architecture parameters for a specific model.
`DatasetConfig`	Dataset configuration shared across training frameworks.
`HuggingFaceDataset`	Dataset backed by a HuggingFace `datasets` repo.
`HarborDataset`	Dataset backed by a Harbor task directory structure.
`WandbConfig`	Weights & Biases logging configuration shared across all frameworks.
`ModalRayCluster`	Base class for bootstrapping a Ray cluster inside Modal clustered functions.
`TrainResult`	One completed training run’s checkpoint handle.

Evaluation

Class	Description
`EvalConfig`	Evaluate a deployed model on a dataset config.
`EvalResult`	Saved results for one evaluation run across a dataset.
`EvalRowResult`	One model interaction: the prompt, the raw response, its parsed
`HarborEval`	Evaluate a deployed model on a Harbor dataset using sandbox execution.

Models

Class	Description
`ToolCall`	A parsed tool invocation from model output.
`ParsedResponse`	Structured result of parsing raw model output.
`parse_qwen3_response`	Parse Qwen3-family model output into structured content.
`Qwen3-0.6B`	Qwen3-0.6B (0.6 billion parameters) from Alibaba.
`Qwen3-1.7B`	Qwen3-1.7B (1.7 billion parameters) from Alibaba.
`Qwen3-4B`	Qwen3-4B (4 billion parameters) from Alibaba.
`Qwen3-8B`	Qwen3-8B (8 billion parameters) from Alibaba.
`Qwen3-14B`	Qwen3-14B (14 billion parameters) from Alibaba.
`Qwen3-30B-A3B`	Qwen3-30B-A3B (30B total, ~3B active) MoE model from Alibaba.
`Qwen3-32B`	Qwen3-32B (32 billion parameters) from Alibaba.
`Qwen3.6-27B`	Qwen3.6-27B (27B-parameter dense) model from Alibaba.
`Qwen3.6-35B-A3B`	Qwen3.6-35B-A3B (35B total, ~3B active) MoE model from Alibaba.
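
As a rough illustration of what "parsing Qwen3-family output into structured content" can involve, the sketch below splits a raw completion into reasoning, visible content, and tool calls. It assumes the tagging that Qwen3 chat templates commonly emit (`<think>…</think>` for reasoning and `<tool_call>…</tool_call>` wrapping a JSON object with `name` and `arguments`). The names `ToolCallSketch`, `ParsedSketch`, and `parse_qwen3_like` are hypothetical stand-ins; this is not the library's `parse_qwen3_response`, and the real fields of `ToolCall` and `ParsedResponse` may differ.

```python
import json
import re
from dataclasses import dataclass, field

# Assumed tag conventions for Qwen3-style output (not taken from training-gym).
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
TOOL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


@dataclass
class ToolCallSketch:
    """Hypothetical stand-in for a parsed tool invocation."""
    name: str
    arguments: dict


@dataclass
class ParsedSketch:
    """Hypothetical stand-in for a structured parse result."""
    reasoning: str
    content: str
    tool_calls: list = field(default_factory=list)


def parse_qwen3_like(raw: str) -> ParsedSketch:
    # Collect every reasoning block, then remove them from the text.
    reasoning = "\n".join(m.strip() for m in THINK_RE.findall(raw))
    remainder = THINK_RE.sub("", raw)

    calls = []
    for payload in TOOL_RE.findall(remainder):
        try:
            obj = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed tool-call payloads instead of failing
        if isinstance(obj, dict) and "name" in obj:
            calls.append(ToolCallSketch(obj["name"], obj.get("arguments", {})))

    # Whatever is left after stripping tool calls is the user-visible content.
    content = TOOL_RE.sub("", remainder).strip()
    return ParsedSketch(reasoning, content, calls)


raw_output = (
    "<think>\nNeed weather.\n</think>\n\nChecking.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'
)
parsed = parse_qwen3_like(raw_output)
```

Skipping malformed payloads rather than raising is a design choice for this sketch; an evaluation pipeline may prefer to record the failure so that it can be scored.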

Training

Class	Description
`TrainConfig`	Compose dataset, model, and recipe into one training entrypoint.
`MultiTurn`	Configure multi-turn rollout for conversational RL training.
`SlimeRecipe`	Recipe dataclass for configuring slime GRPO training on Modal.
`Qwen3_6_27b_Recipe`	Qwen3.6-27B dense hybrid model on 1×8×H100 with TP4×PP2, colocated GRPO.
`Qwen3_6_35b_Recipe`	Qwen3.6-35B-A3B (MoE) on 1×8×H100 with TP2/PP2/CP1/EP4.
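
The parallelism labels in the recipe descriptions can be sanity-checked with a little arithmetic. The sketch below assumes the common Megatron-style convention that tensor (TP), pipeline (PP), and context (CP) parallel sizes multiply together, with the remaining GPUs forming data-parallel replicas, and it reads "1×8×H100" as one node with eight GPUs. Both are assumptions rather than statements about how the recipes are implemented. The helper `data_parallel_size` is hypothetical, and expert parallelism (EP) is deliberately left out of the model.

```python
def data_parallel_size(num_gpus: int, tp: int, pp: int, cp: int = 1) -> int:
    """Return the data-parallel replica count implied by a TP/PP/CP layout.

    Assumes num_gpus = tp * pp * cp * dp (dense layers only; EP not modelled).
    """
    model_parallel = tp * pp * cp
    if num_gpus % model_parallel != 0:
        raise ValueError(
            f"{num_gpus} GPUs cannot be split evenly into groups of {model_parallel}"
        )
    return num_gpus // model_parallel


# TP4 x PP2 on 8 GPUs: a single model replica spans the whole node.
dense_dp = data_parallel_size(num_gpus=8, tp=4, pp=2)

# TP2 / PP2 / CP1 on 8 GPUs: two replicas under this convention.
moe_dp = data_parallel_size(num_gpus=8, tp=2, pp=2, cp=1)
```

Under these assumptions the dense layout leaves no room for data parallelism, while the TP2/PP2/CP1 layout leaves a factor of two.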

Deployment

Class	Description
`DeploymentConfig`	Deploy a model behind a serving engine.
`ModelDeployment`	A deployed model endpoint.
`SglangRecipe`	SGLang serving configuration.
`VllmRecipe`	vLLM serving configuration.