vllm.entrypoints.serve.disagg.protocol ¶
GenerateRequest ¶
Bases: BaseModel
Source code in vllm/entrypoints/serve/disagg/protocol.py
features class-attribute instance-attribute ¶
features: MultiModalFeatures | None = None
Multimodal hashes and placeholder positions (populated for MM inputs).
sampling_params instance-attribute ¶
sampling_params: SamplingParams
The sampling parameters for the model.
MultiModalFeatures ¶
Bases: BaseModel
Lightweight multimodal metadata produced by the render step.
Carries hashes (for cache lookup / identification) and placeholder positions so the downstream /generate service knows where in the token sequence each multimodal item lives.
.. note:: Phase 1 — metadata only. Phase 2 should add mm_kwargs (processed tensor data) using a binary transport so the /generate side can skip re-processing. The /generate endpoint must also be updated to inject these features into ProcessorInputs before passing to InputProcessor.process_inputs.
Source code in vllm/entrypoints/serve/disagg/protocol.py
PlaceholderRangeInfo ¶
Bases: BaseModel
Serializable placeholder location for a single multi-modal item.