vLLM Prompt Node for ComfyUI
https://www.theoath.studio/projects/comfy-vllm-node?utm_source=civitai
A ComfyUI custom node that generates Stable Diffusion prompts using a locally running vLLM server. Supports wildcard expansion and a fixed prefix for quality tags or style anchors.
Installation
Clone or copy this folder into your ComfyUI/custom_nodes/ directory:
cd ComfyUI/custom_nodes
git clone https://github.com/OATH-Studio/comfy-vLLM

Restart ComfyUI.
Requirements
A running vLLM server (see vLLM docs)
Python package: requests (pip install requests)
ComfyUI
Setup
Start your local vLLM server. The node will automatically detect whichever model is currently loaded. No need to specify it in the node.
Example launch:
vllm serve ./models/Qwen2.5-3B \
--host 0.0.0.0 \
--port 8765 \
--served-model-name Qwen2.5-3B

Note: The node queries /v1/models on each generation and uses the first model returned. If you change models, restart your vLLM server; the node picks up the new model automatically.
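To confirm which model the node will pick up, you can query the same endpoint yourself. A minimal sketch using the requests package (host and port assumed to match the launch command above):

import requests

# Ask the vLLM server which models it is serving; the node uses the first entry.
resp = requests.get("http://localhost:8765/v1/models", timeout=5)
resp.raise_for_status()
models = resp.json()["data"]
print("Model the node will use:", models[0]["id"])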
Node Inputs
prompt (STRING): Generation instruction. Supports {wild|card} syntax.
prefix (STRING, default "masterpiece, best quality, highres"): Fixed tags prepended to the output. Not sent to the model.
host (STRING, default "localhost"): vLLM server host.
port (INT, default 8765): vLLM server port.
max_tokens (INT, default 128): Maximum tokens to generate.
temperature (FLOAT, default 0.7): Sampling temperature. Higher = more creative.
retries (INT, default 3): How many times to retry on empty or failed responses.
Node Output
combined_prompt (STRING): prefix + generated text, ready to wire into CLIPTextEncode.
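As a rough illustration of how the output is assembled (the exact separator the node uses is an assumption, not taken from its source):

prefix = "masterpiece, best quality, highres"
generated = "a red dragon, breathing fire, dramatic lighting"
# Assumed joining logic: prefix first, then the model's comma-separated tags.
combined_prompt = f"{prefix}, {generated}" if prefix else generated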
The node displays a live preview after each generation showing:
Prefix
Raw generated text
Final combined string
Wildcard Syntax
Use {option1|option2|option3} anywhere in your prompt. One option is chosen at random each run. Multiple wildcards are resolved independently.
A {red|blue|green} dragon, {breathing fire into the sky|coiled around a mountain peak in a storm|diving into a glowing ocean abyss|rearing up against a blood moon}

Wildcards are expanded before the prompt is sent to the model, so the model always receives a fully resolved string.
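For illustration, this kind of expansion can be done with a single regex substitution. The sketch below is not the node's actual implementation, and the function name expand_wildcards is hypothetical:

import random
import re

def expand_wildcards(text: str) -> str:
    # Replace each {a|b|c} group with one randomly chosen option.
    # Groups are matched independently, so multiple wildcards resolve separately.
    return re.sub(r"\{([^{}]*)\}",
                  lambda m: random.choice(m.group(1).split("|")),
                  text)

print(expand_wildcards("A {red|blue|green} dragon, {breathing fire|coiled around a peak}"))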
Example Workflow
VLLMPromptNode ──→ CLIPTextEncode (positive) ──→ KSampler
                                                     ↑
                   CLIPTextEncode (negative) ────────┘

Prompt Format
The node uses the completions endpoint with a structured format that forces the model to return comma-separated tags only:
### Stable Diffusion prompt tags (comma separated, no sentences):
Input: <your expanded prompt>
Output:

Generation stops at the first newline, preventing extra text.
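The same request can be reproduced outside ComfyUI against vLLM's OpenAI-compatible completions API. The sketch below illustrates the described behavior rather than the node's actual code; the model name is read from /v1/models as noted above, and the parameter values mirror the node's defaults:

import requests

host, port = "localhost", 8765
base = f"http://{host}:{port}"

# Use whichever model the server reports first, exactly as the node does.
model = requests.get(f"{base}/v1/models", timeout=5).json()["data"][0]["id"]

expanded = "A red dragon, breathing fire into the sky"  # wildcards already resolved
prompt = (
    "### Stable Diffusion prompt tags (comma separated, no sentences):\n"
    f"Input: {expanded}\n"
    "Output:"
)

resp = requests.post(
    f"{base}/v1/completions",
    json={
        "model": model,
        "prompt": prompt,
        "max_tokens": 128,       # node default
        "temperature": 0.7,      # node default
        "stop": ["\n"],          # stop at the first newline, as described above
    },
    timeout=60,
)
generated = resp.json()["choices"][0]["text"].strip()
print(generated)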
If conversational output appears:
Lower the temperature to 0.3–0.5
Use a larger model (≥ 1.5B recommended)
Reduce max_tokens
Model Recommendations
Qwen2.5-0.5B: ⚠️ Unreliable. Too small for consistent instruction following.
Qwen2.5-1.5B: ✓ Usable. Occasional filler, mostly clean.
Qwen2.5-3B: ✓✓ Recommended. Clean output, follows the format reliably.
Qwen2.5-32B: ✓✓✓ Best. Overkill but flawless.
Tested With
vLLM 0.4+
Qwen2.5-0.5B, Qwen2.5-1.5B, Qwen2.5-3B
ComfyUI (latest)
