    Photo Background - 2d Compositing|写真背景・二次元合成 - v1.0 [hunyuan]
    NSFW

    Trained on 2d illustrations composited on a photo background.

    This is a small LoRA I made to see how models trained on illustrations or on real-world images/video can produce a composite, mixed-reality effect.

    ℹ️ LoRAs work best when applied to the base model they were trained on. Please read About This Version for the appropriate base models and for workflow/training information.

    Metadata is included in all uploaded files, so you can drag the generated videos into ComfyUI to use the embedded workflows.

    Recommended prompt structure:

    Positive prompt (trigger at the end of the prompt; for non-Hunyuan versions, place it before the quality tags):

    {{tags}}
    real world location, photo background,
    masterpiece, best quality, very awa, absurdres

    Negative prompt:

    (worst quality, low quality, sketch:1.1), error, bad anatomy, bad hands, watermark, ugly, distorted, censored, lowres
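
The prompt layout above can be assembled programmatically. This is a minimal sketch, not part of the model card's workflow: the function name `build_positive_prompt`, the `include_quality_tags` flag, and the example subject tags are all hypothetical. It assumes the note above means the trigger phrases go last for the Hunyuan version and just before the quality tags otherwise.

```python
# Trigger phrases and quality tags are copied from the model card.
# The helper itself is an illustrative assumption, not the author's tooling.
TRIGGER = "real world location, photo background"
QUALITY_TAGS = "masterpiece, best quality, very awa, absurdres"


def build_positive_prompt(tags, include_quality_tags=True):
    """Join subject tags, then the trigger phrases, then optional quality tags."""
    subject = ", ".join(t.strip() for t in tags if t.strip())
    parts = [subject, TRIGGER]
    if include_quality_tags:
        parts.append(QUALITY_TAGS)
    # Skip empty parts so an empty tag list does not leave a leading comma.
    return ", ".join(p for p in parts if p)


# Example subject tags are placeholders.
print(build_positive_prompt(["1girl", "standing"]))
print(build_positive_prompt(["1girl", "standing"], include_quality_tags=False))
```

With `include_quality_tags=False` the trigger phrases end the prompt, which is how the Hunyuan version appears to be intended to be used.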

    Description

    Trained with https://github.com/tdrussell/diffusion-pipe

    Training data consists of:

    • 37 images, a combination of:

      • Images used in other versions of this model card

      • Images extracted as keyframes from several videos

    • 23 video clips of ~70 frames each

      • 70 frames was too long for videos at the 368 resolution (exceeded 24 GB of VRAM)

    Training configs:

    dataset.toml

    # Aspect ratio bucketing settings
    enable_ar_bucket = true
    min_ar = 0.5
    max_ar = 2.0
    num_ar_buckets = 7
    
    # Frame buckets (1 is for images)
    frame_buckets = [1]
    
    [[directory]]
    # Set this to where your dataset is
    path = '/mnt/d/huanvideo/training_data/images'
    # Reduce as necessary
    num_repeats = 5
    
    [[directory]] # IMAGES
    # Path to the directory containing images and their corresponding caption files.
    path = '/mnt/d/huanvideo/training_data/images'
    num_repeats = 5
    resolutions = [1024]
    frame_buckets = [1] # Use 1 frame for images.
    
    
    [[directory]] # VIDEOS
    # Path to the directory containing videos and their corresponding caption files.
    path = '/mnt/d/huanvideo/training_data/videos'
    num_repeats = 5
    resolutions = [368] 
    frame_buckets = [33, 49, 81] # Define frame buckets for videos.
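
The frame buckets in the video section relate to the note that full 70-frame clips did not fit in VRAM. As a rough sketch, assuming the behaviour described in the diffusion-pipe README (each video goes to the longest frame bucket it can fill, and `video_clip_mode = 'single_middle'` in config.toml takes one window from the middle of the clip), a ~70-frame clip would be trained as a 49-frame clip. The helper names below are hypothetical and are not diffusion-pipe functions; the real implementation may differ in detail.

```python
def pick_frame_bucket(num_frames, frame_buckets):
    """Return the longest bucket that the clip can fill, or None if none fits."""
    fitting = [b for b in frame_buckets if b <= num_frames]
    return max(fitting) if fitting else None


def middle_clip_range(num_frames, bucket):
    """Start and end (end exclusive) of a bucket-length window centred in the clip."""
    start = (num_frames - bucket) // 2
    return start, start + bucket


frame_buckets = [33, 49, 81]  # from the VIDEOS section of dataset.toml
clip_length = 70              # approximate clip length stated in the card

bucket = pick_frame_bucket(clip_length, frame_buckets)
print(bucket, middle_clip_range(clip_length, bucket))
```

Under these assumptions the 81-frame bucket is only used by clips with at least 81 frames, so it stays empty for a dataset of ~70-frame clips.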

    config.toml

    # Output directory for training runs.
    output_dir = '/mnt/d/huanvideo/training_output'
    # Dataset config file.
    dataset = 'dataset.toml'
    
    # Training settings
    epochs = 50
    micro_batch_size_per_gpu = 1
    pipeline_stages = 1
    gradient_accumulation_steps = 4
    gradient_clipping = 1.0
    warmup_steps = 100
    
    # eval settings
    eval_every_n_epochs = 5
    eval_before_first_step = true
    eval_micro_batch_size_per_gpu = 1
    eval_gradient_accumulation_steps = 1
    
    # misc settings
    save_every_n_epochs = 15
    checkpoint_every_n_minutes = 30
    activation_checkpointing = true
    partition_method = 'parameters'
    save_dtype = 'bfloat16'
    caching_batch_size = 1
    steps_per_print = 1
    video_clip_mode = 'single_middle'
    
    [model]
    type = 'hunyuan-video'
    
    transformer_path = '/mnt/d/huanvideo/models/diffusion_models/hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors'
    vae_path = '/mnt/d/huanvideo/models/vae/hunyuan_video_vae_bf16.safetensors'
    llm_path = '/mnt/d/huanvideo/models/llm'
    clip_path = '/mnt/d/huanvideo/models/clip'
    
    dtype = 'bfloat16'
    transformer_dtype = 'float8'
    timestep_sample_method = 'logit_normal'
    
    [adapter]
    type = 'lora'
    rank = 32
    dtype = 'bfloat16'
    
    [optimizer]
    type = 'adamw_optimi'
    lr = 5e-5
    betas = [0.9, 0.99]
    weight_decay = 0.02
    eps = 1e-8
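
For a sense of scale, the settings above imply an effective batch size of 4 and roughly 75 optimizer steps per epoch. This is a back-of-the-envelope estimate only: it assumes a single GPU (the GPU count is not stated), counts each dataset path once even though the images path appears in two `[[directory]]` entries, and ignores any evaluation split or per-bucket rounding.

```python
import math

# Values copied from config.toml.
micro_batch_size_per_gpu = 1
gradient_accumulation_steps = 4
epochs = 50
num_gpus = 1  # assumption: one GPU; the card does not state the GPU count

# (item count, num_repeats) per dataset path, counting each path once.
directories = {"images": (37, 5), "videos": (23, 5)}

effective_batch = micro_batch_size_per_gpu * gradient_accumulation_steps * num_gpus
samples_per_epoch = sum(count * repeats for count, repeats in directories.values())
steps_per_epoch = math.ceil(samples_per_epoch / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, samples_per_epoch, steps_per_epoch, total_steps)
```

If the trainer loads the images path twice, the per-epoch step count would be correspondingly higher, so treat these numbers as a lower bound rather than the actual run length.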

    LORA
    Hunyuan Video

    Details

    Downloads: 337
    Platform: CivitAI
    Platform Status: Available
    Created: 2/20/2025
    Updated: 3/31/2026
    Deleted: -
    Trigger Words:
    photo background
    real world location