CogVideoX-v1.5-5B I2V workflow for lazy people (Including low VRAM)

CogVideoX-v1.5-5B I2V workflow for lazy people (Including low VRAM) - Joy caption

Update (Florence version): Many people encounter dependency errors when using the Joy Caption plugin, so I have added a version that uses Florence as a replacement. It is easier for beginners and avoids these issues.

This is a separate workflow, not an upgraded version of the previous one.

This workflow uses an LLM to write prompts for CogVideoX-v1.5-5B I2V, which helps those who don't know how to write prompts, or are too lazy to, make better use of CogVideoX v1.5. You can also add guiding prompts of your own, or turn off the LLM feature and write the prompts entirely yourself.

Although v1.5 supports any resolution, output quality still varies with resolution. Test several resolutions to find the one that works best.

For low VRAM users, please keep the following features enabled.

The GGUF version doesn't perform that well based on my tests, but v1.5 is still being updated, so we can expect better results in future versions.

If this happens, it's because the LLM prompt is too long. You can change the random seed to regenerate, or modify the values below to reduce the tokens.

Comments (16)

hunzmusic · Nov 21, 2024 · 3 reactions

Love this workflow, esp the prompts being driven by LLMs and producing the best eye candy animations I've seen from this. WD!

In this setup you have it set to 6 sec (num_frame 49) at 8 fps. One thing you might want to play with: the 1.5 model was trained at 5 sec (num_frame 81) or 10 sec (num_frame 161), both at 16 fps.

Thank you for posting this, it opened up a lot of things I didn't understand.
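The frame counts quoted in the comment above are all consistent with a simple rule: one starting frame plus seconds times fps. The sketch below only illustrates that arithmetic; the function names are hypothetical and the rule is inferred from the three numbers in the comment, not taken from the CogVideoX documentation.

```python
def frames_for(seconds: int, fps: int) -> int:
    """Frame count for a clip: the first frame plus seconds * fps more.

    Assumption: this rule is inferred from the numbers in the comment
    (49 @ 8 fps, 81 and 161 @ 16 fps), not from official docs.
    """
    return seconds * fps + 1


def duration_for(num_frames: int, fps: int) -> float:
    """Inverse of frames_for: clip length in seconds for a given frame count."""
    return (num_frames - 1) / fps


if __name__ == "__main__":
    # The three settings mentioned in the comment.
    for seconds, fps in [(6, 8), (5, 16), (10, 16)]:
        print(f"{seconds} s at {fps} fps -> num_frames = {frames_for(seconds, fps)}")
```

To try the settings the model was reportedly trained on, set the frame count to 81 or 161 and the frame rate to 16 in the workflow, VRAM permitting.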

Cyberai99 · Nov 22, 2024 · 1 reaction

Which folder do I put the Meta-Llama-3-8B-Instruct-Q4_K_S.gguf file in?

Boodengs

Author

Nov 22, 2024

models\LLavacheckpoints

sestador · Nov 22, 2024 · 1 reaction

Joy Caption seems to expect the path "Joy_caption/image_adapter.pt" in the ComfyUI models folder, but it does not exist. Do you know how to fix this?

Boodengs

Author

Nov 22, 2024

models/Joy_caption. You need to create this folder yourself.
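A minimal sketch of creating that folder from a script rather than by hand. The helper names and the `comfyui_root` argument are hypothetical; the only details taken from this thread are the `models/Joy_caption` folder and the `image_adapter.pt` file name.

```python
from pathlib import Path


def ensure_joy_caption_dir(comfyui_root: str) -> Path:
    """Create <comfyui_root>/models/Joy_caption if it is missing and return it.

    comfyui_root is an assumption: pass the folder that contains your
    ComfyUI "models" directory.
    """
    target = Path(comfyui_root) / "models" / "Joy_caption"
    target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return target


def adapter_present(comfyui_root: str) -> bool:
    """Check whether image_adapter.pt has been placed in the folder."""
    adapter = Path(comfyui_root) / "models" / "Joy_caption" / "image_adapter.pt"
    return adapter.is_file()
```

Creating the folder does not download anything; image_adapter.pt still has to be placed there manually.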

starmanj · Nov 24, 2024

@Charine But there is still no image_adapter.pt file, and it is odd that the node installation doesn't create the folder. When I create the folder and download image_adapter.pt from Joy Caption alpha2, I get this error: Error(s) in loading state_dict for ImageAdapter: Unexpected key(s) in state_dict: "other_tokens.weight"

lemniscatainfini3561 · Nov 25, 2024

@Charine Same error here.

Boodengs

Author

Nov 25, 2024

@starmanj I haven’t encountered this issue, so I’m not sure where the error is. However, you can use the Florence node as a replacement for this one.

Boodengs

Author

Nov 25, 2024

@lemniscatainfini3561 I haven’t encountered this issue, so I’m not sure where the error is. However, you can use the Florence node as a replacement for this one.

Boodengs

Author

Nov 25, 2024

@starmanj I’ve shared a Florence version that avoids dependency issues.

Boodengs

Author

Nov 25, 2024 · 1 reaction

@lemniscatainfini3561 I’ve shared a Florence version that avoids dependency issues.

lemniscatainfini3561 · Nov 25, 2024

@Charine Thank you!

HenryS45 · Dec 31, 2024

You should be able to use this:

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/blob/main/wpkklhc6/image_adapter.pt

Download it and put it in the models/Joy_caption folder.

You will then likely be met with a new error: Joy_caption Expected device_type of type str, got: <class 'torch.device'>

To fix it, open the file C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_HF_Servelress_Inference\nodes\Joy_Caption.py with a text editor and change line 171 from

with torch.amp.autocast_mode.autocast(DEVICE, enabled=True):

to

with torch.amp.autocast_mode.autocast('cuda', enabled=True):

After a restart of ComfyUI the problem should be solved.

Credits to original post I found here:
https://www.reddit.com/r/comfyui/comments/1f0amyp/deleted_by_user/
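The error in HenryS45's comment says autocast wants the device type as a string, while the node passes a torch.device object. Hard-coding 'cuda' works on NVIDIA GPUs; a slightly more general option is to derive the string from whatever DEVICE holds. The helper below is a sketch, not part of the Joy Caption node: `device_type_str` is a hypothetical name, and it assumes DEVICE is either a torch.device (which exposes a `.type` attribute such as "cuda" or "cpu") or a plain string like "cuda:0".

```python
def device_type_str(device) -> str:
    """Return the bare device type ("cuda", "cpu", ...) as a string.

    Accepts a torch.device-like object (anything with a .type attribute)
    or a string such as "cuda:0". Hypothetical helper for illustration.
    """
    kind = getattr(device, "type", device)  # torch.device -> "cuda"; str stays as-is
    return str(kind).split(":", 1)[0]       # drop an index suffix like ":0"


# Possible use in Joy_Caption.py (assumption: DEVICE is defined there as above):
#     with torch.amp.autocast_mode.autocast(device_type_str(DEVICE), enabled=True):
```

If you only ever run on an NVIDIA GPU, the one-word change to 'cuda' described in the comment is simpler and has the same effect.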

moehawk · Nov 24, 2024

Is there a way to make this video loop seamlessly?

Boodengs

Author

Nov 25, 2024 · 1 reaction

You can try the CogvideoX-5b version or the CogvideoX-fun-v1.1 version with interpolation, using the same image as both the first and the last frame to create a loop.

moehawk · Nov 25, 2024

@Charine I will give it a try. Thank you for your time.

by Boodengs