CivArchive
    MoanForge – MMAudio SFW+NSFW Audio Enhancer w/ Qwen TTS - v1.0
    NSFW

    Add killer audio to ANY clip - moaning/voice-over TTS, filthy SFX, breathy voices - or keep it clean with SFW sounds. No editing skills needed.

    There are descriptions in the workflow - READ THEM CAREFULLY.


    Yo creators - if you're tired of flat, boring video audio and want to crank immersion to 11 with AI-powered SFX layers, breathy moans, custom voices, wet sounds, slapping, and seamless mixing - this is your new go-to workflow.

    Built in ComfyUI specifically for NSFW / adult video generation. Takes any input clip, interpolates frames for smooth playback, generates ultra-realistic or exaggerated lewd sound effects via dual MMAudio branches (SFW for clean ambience & impacts, NSFW for gagging, slurping, sticky thrusts, heavy breathing), overlays expressive TTS dialogue/moans from Qwen3-TTS (voice cloning from reference or pure text-based design), and even lets you blend in the original video audio for that hybrid real-AI punch.

    No more manual editing in Audacity or Resolve - queue it up, tweak volumes/seeds/prompts, and export MP4s with timestamped filenames ready for your stash. Perfect for adult ASMR, scene enhancement, erotic animation, or just experimenting with er... sound design.

    ### Key Features
    - **Dual MMAudio Branches** - SFW (vanilla model for natural sounds) + NSFW (gold-tuned for explicit SFX). Mute groups to switch modes without errors.
    - **Qwen TTS Power** - Separate groups for Voice Design (text-based fantasy voices like “sultry breathy moans”) or Voice Cloning (from ref audio + transcript). Auto-trims to match video length.
    - **Geeky AudioMixer Core** - 4-track mixing with per-layer volume, start offsets, fades, master normalize/compress/limit. Crank original video audio or mute it via Primitive toggle.
    - **RIFE FPS Converter** - Auto-handles any input FPS → targets 25 fps (or your choice) for perfect MMAudio sync.
    - **Shared Seed Control** - One Primitive seeds MMAudio + TTS for consistent randomness across layers.
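The FPS conversion and auto-trim features above come down to simple arithmetic: the clip duration fixes both the target frame count and the number of audio samples each layer must have. Below is a minimal sketch of that arithmetic, not the workflow's actual node code. The function names are hypothetical, and the 44100 Hz sample rate is an assumption inferred from the "44k" in the MMAudio model names.

```python
def resample_plan(src_frames: int, src_fps: float,
                  target_fps: float = 25.0,
                  sample_rate: int = 44100) -> tuple[float, int, int]:
    """Return (duration_s, target_frames, audio_samples) for a clip.

    sample_rate=44100 is an assumption, not a value read from the workflow.
    """
    duration_s = src_frames / src_fps
    target_frames = round(duration_s * target_fps)   # frames after interpolation
    audio_samples = round(duration_s * sample_rate)  # samples every layer should have
    return duration_s, target_frames, audio_samples


def trim_or_pad(samples: list[float], n: int) -> list[float]:
    """Cut a track to n samples, or pad it with silence if it is shorter."""
    return samples[:n] + [0.0] * max(0, n - len(samples))
```

For example, a 96-frame clip at 16 fps lasts 6 seconds, so it needs 150 frames at 25 fps and 264600 audio samples at the assumed rate.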

    ### How to Use
    1. Drop your video into VHS_LoadVideo.
    2. Tweak prompts:
       - MMAudio SFW and/or NSFW
       - TTS (Voice Design or Cloning)
    3. Mute or bypass the groups you don't need (SFW, NSFW, or TTS) to switch modes.
    4. Adjust mixer volumes: audio_1 is the main voice/TTS track; the optional inputs carry the SFX and the original video audio.
    5. Queue - outputs an MP4 with embedded audio.
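To make the mixer step less of a black box, here is a rough pure-Python sketch of what per-layer volume, start offsets, and a master normalize stage do to the samples. It is an illustration under stated assumptions, not the Geeky AudioMixer implementation: `mix_tracks` and its parameters are hypothetical, tracks are mono float lists, and the master stage is plain peak normalization (no fades, compression, or limiting).

```python
def mix_tracks(tracks, sample_rate=44100, peak=0.9):
    """Mix mono tracks given as (samples, volume, start_offset_s) tuples.

    Hypothetical helper: sums the layers at their offsets, then scales the
    result down only if its loudest sample exceeds `peak`.
    """
    if not tracks:
        return []
    # The output must be long enough to hold the latest-ending layer.
    length = max(round(off * sample_rate) + len(s) for s, _, off in tracks)
    out = [0.0] * length
    for samples, volume, offset_s in tracks:
        start = round(offset_s * sample_rate)
        for i, x in enumerate(samples):
            out[start + i] += volume * x
    loudest = max(abs(x) for x in out)
    if loudest > peak:
        scale = peak / loudest
        out = [x * scale for x in out]
    return out
```

Lowering a layer's volume before the sum, rather than after normalization, is what keeps the voice track on top of the SFX layers.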

    ### Pro Tip – Pair with LTX2 (Lightricks' video generation model)
    LTX2 is blowing up for realistic lip sync and facial animation right now - but it lacks rich, layered lewd audio (moans, wet SFX, heavy breathing, gagging).
    Easy hybrid:
    1. Generate your talking-head / character video in LTX2.
    2. Load the LTX2 MP4 into this workflow's VHS_LoadVideo node.
    3. Run as usual - MMAudio adds filthy SFX layers, Qwen TTS overlays breathy moans/dialogue.
    4. Mix with LTX2 original audio (or mute it) - final export has perfect visual sync + dirty sound design.
    LTX2 does lips & face, this workflow does the lewd audio. Instant upgrade. 🔥

    ### Low-VRAM Tips (12 GB seems to be the minimum)
    - Use **Qwen3-TTS 0.6B** instead of 1.7B → saves ~3–4 GB with only minor quality drop on short clips.
    - Set **unload_model_after_generate = true** in TTS nodes → unloads TTS immediately after generation.
    - Reduce **RIFE batch_size** to 4 or 2 → lowers peak during interpolation.
    - For tighter VRAM: mute RIFE group if not needed, or add **Unload All Models** node (ComfyUI-Unload-Model extension) after TTS before MMAudio.
    - Peak usage drops to ~9–11 GB with these tweaks. 8 GB cards are very tight (possible only with extreme cuts such as no RIFE and tiny clips).
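To see whether these tweaks actually lower your peak, you can poll the GPU while the workflow runs. The sketch below wraps the standard `nvidia-smi --query-gpu` CSV output. The helper names are hypothetical, and it assumes an NVIDIA card with `nvidia-smi` on the PATH.

```python
import subprocess


def parse_gpu_memory(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' MiB rows as printed with csv,noheader,nounits."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(part.strip()) for part in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory() -> list[tuple[int, int]]:
    """Return (used_mib, total_mib) per GPU; requires nvidia-smi on PATH."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` once per second in a separate terminal during a run and keeping the maximum gives a usable estimate of peak usage.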

    ### Requirements (Custom Nodes via ComfyUI Manager)
    - comfyui-mmaudio (kijai/ComfyUI-MMAudio)
    - qwen3-tts-comfyui (flybirdxx/ComfyUI-Qwen-TTS or similar fork)
    - ComfyUI_Geeky_AudioMixer (GeekyGhost/ComfyUI_Geeky_AudioMixer)
    - ComfyUI-VideoHelperSuite (Kosinkadink/ComfyUI-VideoHelperSuite)
    - rgthree-comfy (rgthree/rgthree-comfy)
    - comfyui-frame-interpolation (Fannovel16/ComfyUI-Frame-Interpolation) for RIFE VFI
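If the workflow loads with red (missing) nodes, a quick check of the `custom_nodes` folder can tell you which pack is absent. This is a minimal sketch: the folder names are assumed to match the repository names listed above, which holds for a plain `git clone` but may differ for ComfyUI Manager installs, and `missing_node_packs` is a hypothetical helper.

```python
from pathlib import Path

# Assumed folder names, taken from the repository names in the list above.
EXPECTED_PACKS = [
    "ComfyUI-MMAudio",
    "ComfyUI-Qwen-TTS",
    "ComfyUI_Geeky_AudioMixer",
    "ComfyUI-VideoHelperSuite",
    "rgthree-comfy",
    "ComfyUI-Frame-Interpolation",
]


def missing_node_packs(custom_nodes_dir, expected=EXPECTED_PACKS):
    """Return the expected pack names with no matching folder (case-insensitive)."""
    root = Path(custom_nodes_dir)
    if not root.is_dir():
        return list(expected)
    installed = {p.name.lower() for p in root.iterdir() if p.is_dir()}
    return [name for name in expected if name.lower() not in installed]
```

Point it at your own install, for example `missing_node_packs("ComfyUI/custom_nodes")`, and install whatever it reports through ComfyUI Manager.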

    ### Models Needed (auto-download or manual from Hugging Face)
    - MMAudio SFW: kijai/mmaudio_large_44k_v2_fp16.safetensors
    - MMAudio NSFW: phazei/mmaudio_large_44k_nsfw_gold_8.5k_final_fp16.safetensors
    - Qwen3-TTS: Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice (or 0.6B for lower VRAM)

    Workflows
    Other

    ### Details
    - Downloads: 80
    - Platform: CivitAI
    - Platform Status: Available
    - Created: 1/30/2026
    - Updated: 4/27/2026
    - Deleted: -

    ### Files
    - moanforgeMmaudioSFWNSFW_v10.zip

    ### Mirrors
    - Hugging Face (1 mirror)