`from modal_training_gym.common.harbor.eval import HarborEval`

Evaluate a deployed model on a Harbor dataset using sandbox execution.
Inherits from: EvalConfig
Fields
| Field | Type | Default | Description |
|---|---|---|---|
| `dataset` | `'DatasetConfig'` | | |
| `eval_fn` | `EvalFn \| None` | `None` | |
| `eval_response_fn` | `EvalResponseFn \| None` | `None` | |
| `prompt_column` | `str \| None` | `None` | |
| `eval_config_id` | `str \| None` | `None` | |
| `generate_kwargs` | `dict[str, Any]` | `{}` | |
| `model` | `'ModelConfig \| None'` | `None` | |
| `test_cases` | `list[dict[str, str]] \| None` | `None` | |
| `sandbox_timeout` | `int` | `60` | |
| `sandbox_cpu` | `float` | `1.0` | |
| `sandbox_memory` | `int` | `1024` | |
| `sandbox_cpu_policy` | `str` | `"limit"` | |
| `sandbox_memory_policy` | `str` | `"limit"` | |
| `sandbox_python_version` | `str` | `"3.11"` | |
| `extract_code_fn` | `Callable[[str], str] \| None` | `None` | |
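
The table gives only the type of `extract_code_fn`, `Callable[[str], str]`, and does not describe its behavior. As a minimal sketch, assuming the callable receives the raw model response and returns the code to run in the sandbox (an assumption, not something this page confirms), a function of that shape might look like the following. The name `extract_fenced_code` is hypothetical and is not part of the library.

```python
import re
from typing import Callable

# The fence marker is built programmatically so this example can sit inside a fenced block.
FENCE = "`" * 3

# Matches the first fenced block, allowing an optional language tag such as "python".
_FENCED_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_fenced_code(response: str) -> str:
    """Return the body of the first fenced code block.

    Falls back to the stripped response when no fenced block is found.
    Hypothetical helper: the input/output contract is assumed from the type alone.
    """
    match = _FENCED_BLOCK.search(response)
    if match is None:
        return response.strip()
    return match.group(1).strip()


# The helper satisfies the Callable[[str], str] shape listed for extract_code_fn.
extract_code_fn: Callable[[str], str] = extract_fenced_code
```

If the field works as assumed, the helper would be supplied as `extract_code_fn=extract_fenced_code`. Leaving the field at its default of `None` is the safer choice until the library's own handling of model responses is confirmed.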