GPT-4o has been released, and I'm joining the excitement by adding GPT-4o support to my open-source ComfyUI agent project, bringing its vision capabilities into ComfyUI.
The project address is [heshengtao/comfyui_LLM_party](https://github.com/heshengtao/comfyui_LLM_party): a set of block-based LLM agent node libraries designed for ComfyUI development.
In my open-source project, you can use these features:
- You can right-click in the ComfyUI interface, select `llm` from the context menu, and you will find the nodes for this project. [How to use nodes](how_to_use_nodes.md)
- Supports API integration or local large model integration, with tool invocation implemented modularly. When entering the base_url, please use a URL that ends with `/v1/`. You can use [ollama](https://github.com/ollama/ollama) to manage your model: enter `http://localhost:11434/v1/` for the base_url, `ollama` for the api_key, and your model name for the model_name, for example `llama3`. If the call fails with a 503 error, try turning off the proxy server (see the Python sketch after this list for a quick way to verify these values).
- Local knowledge base integration with RAG support.
- Ability to invoke code interpreters.
- Online queries, including Google search support.
- Conditional statements within ComfyUI to categorize user queries and provide targeted responses.
- Looping links between large models, allowing two large models to engage in debates.
- Attach any persona mask and customize prompt templates.
- Various tool invocations, including weather lookup, time lookup, knowledge base, code execution, web search, and single-page search.
- Use an LLM as a tool node.
- Rapidly develop your own web applications using the API plus Streamlit. The picture below is an example of a drawing application. (A minimal Streamlit sketch follows this list.)
- Newly added dangerous omnipotent interpreter node that allows the large model to perform any task.
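If you want to sanity-check the base_url, api_key, and model_name values mentioned above before wiring them into the LLM node, a minimal sketch using the official `openai` Python client against a local ollama endpoint (assuming ollama is running and `llama3` has already been pulled) looks like this:

```python
from openai import OpenAI

# The same three values the LLM node asks for:
# base_url must end with /v1/, api_key can be any placeholder for ollama,
# and model_name is the model you pulled with ollama.
client = OpenAI(
    base_url="http://localhost:11434/v1/",
    api_key="ollama",
)

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello from ComfyUI!"}],
)
print(response.choices[0].message.content)
```

If this script gets a reply, the same values should work in the node; as noted above, a 503 error here is often caused by a proxy server intercepting the local request.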
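For the API + Streamlit point, the drawing application itself is not reproduced here, so the following is only a generic sketch of a Streamlit front end that sends a prompt to an OpenAI-compatible endpoint; the URL, key, and model name are placeholders, not part of this project's own API.

```python
import streamlit as st
from openai import OpenAI

# Placeholder endpoint, key, and model: point these at whatever
# OpenAI-compatible backend your workflow or model server exposes.
client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

st.title("LLM chat demo")
prompt = st.text_input("Your question")

if st.button("Ask") and prompt:
    reply = client.chat.completions.create(
        model="llama3",
        messages=[{"role": "user", "content": prompt}],
    )
    st.write(reply.choices[0].message.content)
```

Save it as `app.py` and launch it with `streamlit run app.py`.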
It is recommended to use the `show_text` node under the `function` submenu of the right-click menu as the display output for the LLM node.