VllmRecipe

vLLM serving configuration.

from modal_training_gym.deploy_recipes.vllm_recipe import VllmRecipe

Inherits from: BaseDeployRecipe

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| recipe_type | `DeployRecipeType` | `vllm` | |
| gpu | `Optional[Literal['H100', 'H200', 'B200', 'B300']]` | `None` | GPU type for the serving container. Default None (inferred). |
| n_gpu | `int \| None` | `None` | Number of GPUs (tensor-parallel degree for vLLM). Default None. |
| extra_vllm_args | `list[str] \| None` | `None` | Additional CLI args passed to vllm serve. Default None. |
| environment_name | `str \| None` | `None` | Modal environment to deploy into. Default None. |
| deploy_strategy | `str` | `"rolling"` | Modal deployment strategy. Default "rolling". |
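
As a rough illustration of how these fields could map onto a `vllm serve` invocation, the sketch below uses a stand-in dataclass rather than the real class. `VllmRecipeSketch` and `build_serve_command` are hypothetical names introduced here, and forwarding `n_gpu` as `--tensor-parallel-size` is an assumption based on the field description, not a statement of how the library builds its command.

```python
from dataclasses import dataclass
from typing import Literal, Optional


# Hypothetical stand-in mirroring the documented fields. The real VllmRecipe
# inherits from BaseDeployRecipe and may validate or behave differently.
@dataclass
class VllmRecipeSketch:
    recipe_type: str = "vllm"
    gpu: Optional[Literal["H100", "H200", "B200", "B300"]] = None
    n_gpu: Optional[int] = None
    extra_vllm_args: Optional[list[str]] = None
    environment_name: Optional[str] = None
    deploy_strategy: str = "rolling"


def build_serve_command(model: str, recipe: VllmRecipeSketch) -> list[str]:
    """Assemble a `vllm serve` argv list from the recipe (assumed mapping)."""
    cmd = ["vllm", "serve", model]
    if recipe.n_gpu is not None:
        # Assumption: n_gpu is forwarded as the tensor-parallel degree.
        cmd += ["--tensor-parallel-size", str(recipe.n_gpu)]
    # Extra arguments are appended verbatim after the generated flags.
    cmd += recipe.extra_vllm_args or []
    return cmd


recipe = VllmRecipeSketch(
    gpu="H100",
    n_gpu=2,
    extra_vllm_args=["--max-model-len", "8192"],
)
# "my-org/my-model" is a placeholder model name.
command = build_serve_command("my-org/my-model", recipe)
print(" ".join(command))
```

With the real class, the same keyword arguments would presumably be passed to `VllmRecipe(...)` directly; fields left unset keep the defaults listed in the table.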

Source: modal_training_gym/deploy_recipes/vllm_recipe.py