CivArchive
    Hunyuan video t2v with facedetailer POC - v1.2
    NSFW
    Preview 58841174
    Preview 58841185
    Preview 58841250
    Preview 59415799
    Preview 59425431

    HUNYUAN VIDEO FACEDETAILER UPDATE V1.2

    Bug fixing. The get latent size node doesen't work with hunyan latent video: it output a different video width breaking the image W/H ratio. Thanks GoonyBird for pointing out.

    ___________________________________________________________________________

    HUNYUAN VIDEO FACEDETAILER UPDATE V1.1

    I tried to do a few improvements in the facedetailer group mainly focused on the flickering that was annoying quite some people (including me) in many situations like small faces or slow motion.

    I used animatediff detector for the BBOX that averages the bbox over neighbor frames, increased the bbox dilation and decreased shift, conditioning and denoise.

    Hope it works a bit better that V1, it comes at the cost of a slightly lower face definition.

    Further optimizations can be done, they are highly dependant of the video produced in the first stage, in case of flickering I'd suggest to play in the facedetailer module with:

    -Simple detector for animatediff

    increase bbox_dilation (max 100)

    -Modelsampling

    Decrease shift (min 0)

    -Fluxguidance

    Decrease guidance (min 1)

    -Basicscheduler

    Decrease denoise below 0.4

    In case you cannot get rid of flickering I have created as an alternative a static bbox facedetailer. Basically it creates a static bbox as union of all possible position of the face in the video. As you may understand this option is only convenient in case of limited face movements in the video. In my opinion quality is a bit lower compared to dynamic bbox but it is flicker free and more robust to many contitions like face disappearing, side view, etc.

    .

    I've also improved few more things:

    -Changed the sampler to TTP_teacache. The ultra fast can be used for refining in intermediate V2V or facedetailer stage with small degradation but significant speed improvements

    -Improved the upscale that is now in line with x16 requirement of hunyuan video.

    -Fixed size (roughly 368x5xx) for the face to be processed by latest facedetailer sampler box, you can adjust if you have > 12Gb card.

    -Optimized the flow so you can separately select upscale, the two facedetailer flavours and the interpolator.

    -Highlighted in green the noder you likely want to adjust.

    Have fun!

    ____________________________________________________________________________________________

    Starting from excellent bonetrousers t2v workflow:

    https://civarchive.com/models/1092466/hunyuan-2step-t2v-and-upscale?modelVersionId=1294744

    I was wondering if something better could be done instead of a simple upscale as a last step.

    Taking inspiration from the facedetailer principle, it could be useful to take additional care to the face upscale which is often of small size, low resolution and tremebling, and in this case upscale would not help much.

    My idea is to bbox the face, cropping it, sendin to a separate Hunyuan detailer workflow (with an additional prompt dedicated to face) and uncrop it back to the original image.

    Image resize is to match the Hunyuan latent size before sending it to VAE encoder and to sampler and to have good resolution image to fed in to HunYuan refiner. After VAE encode, Hunyuan sampler and VAE decode, uncrop and feed to subsequent steps. That's it.

    I did some basic tests with the attached workflow, just as a proof of concept, and results seems promising.

    Of course it depends a lot on the face size (not much to do if face is very small). Workflow isn't coping very well with static subject, but after all we're talking abount animation here, or?

    I could see noticeable improvement in the subject details and less blurring or trembling. Few examples in this post (only the last one is interpolated)

    I'm a very noob Comfyui user, I argued few hours with image list and batches, and my workflow is very basic. I did not not much of finetune, so I believe there's quite some space for improvement by expert users. For example deciding the refiner video size based on the bbox size or even building a proper facedetailer node for Hunyuan.

    Depending on available time I'll try to improve it next weekend, for example adding it to excellent LatentDream allinone workflow.

    As usual coffes are welcome

    https://ko-fi.com/vamjoe

    Description

    HUNYUAN VIDEO FACEDETAILER UPDATE V1.2

    Bug fixing. The get latent size node doesen't work with hunyan latent video: it output a different video width breaking the image W/H ratio. Thanks GoonyBird for pointing out.

    FAQ

    Comments (26)

    sarashinaiFeb 19, 2025
    CivitAI

    What am I missing? In the workflow from the sample video, everything is disabled apart from the low resolution group...

    infusionminxFeb 19, 2025

    There's a control switch on the left to turn them on...

    sarashinaiFeb 19, 2025

    @Infernomax Oh, I can see that, but when I generate videos and save metadata, it saves what's been turned on and what hasn't. So, when I open yours, I'd expect more than the basic generation to be active...

    slimedFeb 23, 2025· 1 reaction

    the extension cable is controlled by the back door light switch, Clark

    sarashinaiFeb 23, 2025· 1 reaction

    @slimed There's a reference here that I just know I'm not getting...

    infusionminxFeb 19, 2025
    CivitAI

    Spent a whole night trying to get 1.1 working ahah, still getting an error message on the Batch Crop From Mask when starting with 320x320px. I get error: zero-size array to reduction operation minimum which has no identity - Can you help or explain how I am meant to feed in the image accurately sized for the mask detection and original?

    iljoe
    Author
    Feb 19, 2025

    Sorry to hear that. In the facedetailer (dynamic bbox) you get an error in the batch crop from mask when the face is not available in all frames of the video, because the mask batch get broken. Disable the group facedetailer (dynamic bbox) and user facedetailer (static bbox) which is less detailed but more robust. Also use version 1.2, version 1.1 had incorrect widh/height and might cause the image to be cut. Finally 320x320 is a bit high for a 12Gb, reduce the following upscales or you may get an oom.

    swinFeb 20, 2025

    In my situation, when I faced this error, I lowered the bbox_threshold to 0.5 in the Simple Detector node, and it resolved the issue. I was working with footage where the face had a small size/resolution, and I believe the high threshold prevented the system from accurately locating the faces in the video.

    3427221Feb 19, 2025
    CivitAI

    I like it so far, but no double block lora loader in your workflow, meaning after two loras (or let's say 3 max, you get blurry and more messy output, when you load with double block you can stack up to 5, maybe 6 loras and still maintain decent output

    Edit : finally couldn't run, at some point it stop with SimpleMath+

    SimpleMath.execute() got an unexpected keyword argument 'c' error

    swinFeb 20, 2025

    Try replacing the math node with the Math Expression from the comfyui-custom-scripts,it worked in my case.

    3427221Feb 20, 2025

    @swin Okay i'll try next time

    soundgateFeb 22, 2025· 1 reaction
    CivitAI

    This is one of the BEST workflows I've seen! Its fast- clear (if you don't mess to hard with lora's) and easy to understand! Thanks for sharing!

    slimedFeb 23, 2025

    lost me at to

    soundgateFeb 23, 2025

    @slimed I tend to run a lot of lora's- Its' not you're flow I think its me. This is still one of my fav workflow though!

    iljoe
    Author
    Feb 25, 2025· 1 reaction

    noArtifact suggested that double_block lora loader could work better, I'll give it a try

    soundgateFeb 25, 2025

    @iljoe hummmm.... any other suggestions? That's what I've been using =D. Thanks for the suggestion though! Keep em coming.

    mxcrookFeb 22, 2025
    CivitAI

    I found another variant of matte tracking you might find useful. Your end results look great (WAY better than mine). But maybe you can find a use for the method, cause I'm hitting a wall. https://civitai.com/models/1282180/hunyuan-masking-localized-detailing

    topspeedFeb 24, 2025
    CivitAI

    This looks awesome and got all the pieces. However, I though this ran on 12gb given "Fixed size (roughly 368x5xx) for the face to be processed by latest facedetailer sampler box, you can adjust if you have > 12Gb card."

    Anyone able to actually running this wf on 12gb? If so, any particular tips. I oom at the "TTP_TeaCache HunyuanVideo Sampler" node.

    iljoe
    Author
    Feb 25, 2025

    I have a 12Gb and it's working, but I'm using secondary card for display do I might have few hundreds of Mb of advantage on you. Which sampler is givin you issues? It 's the first hunyuan low resolution group try to decrease a bit the latent video size or the number of frames. If it's the intermadiate v2v group try to decrease the upscale.

    boxertwin75593Feb 25, 2025
    CivitAI

    This is amazing and is my go-to workflow!

    I've slightly improved when stacking LoRAs by lowering the character LoRA to 0 6 and then reloading only the character LoRA to 1.0 at the face detailing step. Can give greater response to motion LoRAs.

    One thing I do still occasionally get is clipping square (thin black line) around bbox, any suggestions?

    iljoe
    Author
    Feb 25, 2025· 1 reaction

    Thanks for the suggestion, I'm collecting hints from all you guys and eventually integrate them in an update. Black line is caused by face getting close/out of image border. Try to reduce bbox_dilation to have the bbox always inside the image, otherwise go for the static bbox group instead.

    boxertwin75593Feb 25, 2025

    Oh one more thing, when hands go all messed-up after upscale am I using the wrong upscale model?

    4880569Mar 18, 2025· 1 reaction
    CivitAI

    Hey Mate. Thank you, this is gold! I modify some node: delete BBOX and Upscale by model group. BBOX indeed makes face look sharper but somehow it differs from the original facial expression. And upscale by model I just don't like it. I am tweaking the Latent upscale and refinement (basically the same with latent upscale). Then I add interpolation and VFI as the cherry on top.

    thaddeuskMar 21, 2025
    CivitAI

    I keep getting this error on the Face Detailer steps:

    SimpleMath.execute() got an unexpected keyword argument 'c'

    Jaffer1979Apr 9, 2025
    CivitAI

    Thanks for sharing this I'm loving it.

    cooperdkJul 17, 2025
    CivitAI

    This won't load on a current CUI with nothing added.

    Workflows
    Hunyuan Video

    Details

    Downloads
    2,465
    Platform
    CivitAI
    Platform Status
    Available
    Created
    2/18/2025
    Updated
    5/13/2026
    Deleted
    -

    Files

    hunyuanVideoT2vWith_v12.zip

    Mirrors

    HuggingFace (1 mirrors)
    CivitAI (1 mirrors)