EvalConfig

Evaluate a deployed model on a dataset config.

from modal_training_gym.common.eval import EvalConfig

Field             Type                    Default  Description
dataset           DatasetConfig
eval_fn           EvalFn | None           None
eval_response_fn  EvalResponseFn | None   None
prompt_column     str | None              None
eval_config_id    str | None              None
generate_kwargs   dict[str, Any]          {}
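
To make the shape of the fields listed above concrete, here is a minimal, self-contained sketch that mirrors their names, types, and defaults with a local dataclass. It is not the library's class: `EvalConfigSketch`, `DatasetConfigStub`, `EvalFnStub`, the `name` attribute, and the `max_tokens` key are all hypothetical stand-ins, and whether the real `EvalConfig` is implemented as a dataclass is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


# Hypothetical stand-in: the real DatasetConfig is defined in
# modal_training_gym and its fields are not shown on this page.
@dataclass
class DatasetConfigStub:
    name: str  # assumed attribute, for illustration only


# Assumption: EvalFn and EvalResponseFn are callables of some kind.
EvalFnStub = Callable[..., Any]


@dataclass
class EvalConfigSketch:
    """Mirrors the documented field table; not the library's EvalConfig."""

    dataset: DatasetConfigStub  # the only field listed without a default
    eval_fn: Optional[EvalFnStub] = None
    eval_response_fn: Optional[EvalFnStub] = None
    prompt_column: Optional[str] = None
    eval_config_id: Optional[str] = None
    # A dict default has to be created per instance, hence default_factory.
    generate_kwargs: dict[str, Any] = field(default_factory=dict)


# Only `dataset` is supplied; every other field falls back to its default.
default_config = EvalConfigSketch(dataset=DatasetConfigStub(name="example"))

# Overriding a few optional fields (the values are made up for illustration).
config = EvalConfigSketch(
    dataset=DatasetConfigStub(name="example"),
    prompt_column="question",
    generate_kwargs={"max_tokens": 64},
)
```

With the real class, the same keyword arguments would presumably be passed to `EvalConfig(...)` imported as shown above, but check the source file before relying on that.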

build_prompt(self, row: DatasetRow) -> str

evaluate(self, deployment: ModelDeployment, debug: bool = False, max_concurrency: int = 1) -> EvalResult

Source: modal_training_gym/common/eval.py