# DatasetConfig

```python
from modal_training_gym.common.dataset import DatasetConfig
```

Dataset configuration shared across training frameworks.
## Fields

| Field | Type | Default | Description |
|---|---|---|---|
| prompt_data | str | "" | Path to the training data file (e.g. a .parquet file on the data volume). |
| eval_prompt_data | `list[str] \| str \| None` | None | Path(s) to evaluation data file(s). |
| input_key | str | "" | Column/key name for model input in the dataset. |
| label_key | str | "" | Column/key name for labels/targets in the dataset. |
| apply_chat_template | bool | True | Whether to apply the model’s chat template to inputs. |
| rollout_shuffle | bool | True | Whether to shuffle data during rollout generation. |
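As a quick illustration, constructing a config from these fields might look like the sketch below. Only the import path comes from this page; the paths and column names are hypothetical placeholders.

```python
from modal_training_gym.common.dataset import DatasetConfig

# All values below are illustrative assumptions, not documented defaults.
dataset = DatasetConfig(
    prompt_data="/data/gsm8k/train.parquet",  # hypothetical training file on the data volume
    input_key="question",                     # column holding the model input
    label_key="answer",                       # column holding the label/target
    apply_chat_template=True,                 # wrap inputs in the model's chat template
    rollout_shuffle=True,                     # shuffle data during rollout generation
)
```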
## Methods

### `prepare(self) -> None`

Download and/or preprocess the dataset into the data volume.
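In a typical pipeline, `prepare()` would be called once before training so the processed data lands on the data volume. A minimal sketch, assuming the constructor accepts the keyword fields listed above and that the path is a hypothetical example:

```python
from modal_training_gym.common.dataset import DatasetConfig

cfg = DatasetConfig(prompt_data="/data/train.parquet")  # hypothetical path
cfg.prepare()  # download/preprocess into the data volume; returns None
```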
## Related Tutorials

- Shared concepts: config containers, framework factories, volume layout, running the pipeline
- Custom HuggingFace model (SmolLM2-135M) LoRA SFT — inline `ModelConfiguration` subclass, no catalog entry
- Qwen3-4B GRPO on GSM8K (colocated)
- Customizing your slime run — scaling nodes, parallelism, and throughput
- Qwen3-4B GRPO on haiku poems — structure score + LLM judge
- Qwen3-4B RL code-golf on MBPP with Harbor sandboxes
- GLM-4.7 LoRA SFT on GSM8K (Megatron)