# ComfyUI wrapper nodes for HunyuanVideo
This repository provides a wrapper for integrating HunyuanVideo into ComfyUI, allowing you to generate high-quality video content using advanced AI models.
Before you begin, ensure you have the following installed:

- git-lfs
- cbm
- ffmpeg

You can install these prerequisites using the following command:

```bash
sudo apt-get update && sudo apt-get install git-lfs cbm ffmpeg
```
Install comfy-cli:

```bash
pip install comfy-cli
```

Initialize ComfyUI:

```bash
#wget https://files.pythonhosted.org/packages/30/e8/a390dd2e83f468327b944bacc5cd2e787e0151f690fec9682a78130a488f/comfyui_frontend_package-1.21.6-py3-none-any.whl
wget https://files.pythonhosted.org/packages/6f/41/23e60b0dac42da9a6a264a1a9a82046283aeddbe522717c14be4e85421fd/comfyui_frontend_package-1.21.7-py3-none-any.whl
pip install comfyui_frontend_package-1.21.7-py3-none-any.whl
pip uninstall questionary
pip install "questionary<2.1.0"
pip install --upgrade typer
comfy --here install
comfy --here update
```
Clone and Install ComfyScript:

- Note: starting with ComfyUI 0.3.60, large parts of the node schema were migrated to v3, which causes conflicts in the node input/output definitions that ComfyScript relies on.
- To use ComfyScript normally, roll back to ComfyUI 0.3.59 (a rollback sketch follows this list).
- Related commit: https://github.com/Chaoses-Ib/ComfyScript/commit/a54e894e4d33b899f3663e6f12bba71c7241411a
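One possible way to pin the version, assuming ComfyUI was installed as a git checkout (as `comfy --here install` does) and that releases are tagged `vX.Y.Z` (tag name is an assumption; check `git tag` first):

```bash
cd ComfyUI
git checkout v0.3.59   # assumed tag name for the 0.3.59 release
pip install -r requirements.txt
cd ..
```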
Then clone and install ComfyScript:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/Chaoses-Ib/ComfyScript.git
cd ComfyScript
pip install -e ".[default,cli]"
pip uninstall aiohttp
pip install -U aiohttp
```
Alternatively, a workflow JSON can be run directly with the comfy CLI instead of going through ComfyScript, e.g.:

```bash
comfy run --workflow video_wan2_2_14B_fun_camera_api_2.json --wait
```
Clone and Install ComfyUI-HunyuanVideoWrapper:

```bash
cd ../
git clone https://github.com/kijai/ComfyUI-HunyuanVideoWrapper
cd ComfyUI-HunyuanVideoWrapper
pip install -r requirements.txt
```
Load ComfyScript Runtime:

```python
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *
```
Install Example Dependencies:

```bash
cd examples
comfy node install-deps --workflow=hyvideo_t2v_example_01.json
```
Update ComfyUI Dependencies:

```bash
cd ../../ComfyUI
pip install --upgrade torch torchvision torchaudio -r requirements.txt
```
Transpile Example Workflow:

```bash
python -m comfy_script.transpile hyvideo_t2v_example_01.json
```
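The transpiler emits the equivalent ComfyScript code for the workflow; assuming it writes to stdout (that is how the script below was obtained), the output can be captured straight into a file:

```bash
python -m comfy_script.transpile hyvideo_t2v_example_01.json > run_t2v.py
```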
Download and Place Model Files:

Download the required model files from Hugging Face:

```bash
huggingface-cli download Kijai/HunyuanVideo_comfy --local-dir ./HunyuanVideo_comfy
```

Copy the downloaded files to the appropriate directories:

```bash
cp -r HunyuanVideo_comfy/ .
cp HunyuanVideo_comfy/hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors ComfyUI/models/diffusion_models
cp HunyuanVideo_comfy/hunyuan_video_vae_bf16.safetensors ComfyUI/models/vae
```
Run the Example Script:

Create a Python script run_t2v.py:

```python
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

with Workflow():
    vae = HyVideoVAELoader(r'hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    model = HyVideoModelLoader(r'hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors', 'bf16', 'fp8_e4m3fn', 'offload_device', 'sdpa', None, None, None)
    hyvid_text_encoder = DownloadAndLoadHyVideoTextEncoder('Kijai/llava-llama-3-8b-text-encoder-tokenizer', 'openai/clip-vit-large-patch14', 'fp16', False, 2, 'disabled')
    hyvid_embeds = HyVideoTextEncode(hyvid_text_encoder, '''high quality nature video of a red panda balancing on a bamboo stick while a bird lands on the panda's head, there's a waterfall in the background''', 'bad quality video', 'video', None, None, None)
    samples = HyVideoSampler(model, hyvid_embeds, 512, 320, 85, 30, 6, 9, 6, 1, None, 1, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True)
    _ = VHSVideoCombine(images, 24, 0, 'HunyuanVideo', 'video/h264-mp4', False, True, None, None, None, pix_fmt='yuv420p', crf=19, save_metadata=True, trim_to_audio=False)
```

Run the script:

```bash
python run_t2v.py
```
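For reference, here is one plausible reading of HyVideoSampler's positional arguments in the call above. The parameter names are assumptions inferred from the transpiled workflow, not confirmed by the wrapper's source, so verify them against the node's inputs in the ComfyUI graph:

```python
# Same call as in run_t2v.py, with assumed parameter names as comments.
samples = HyVideoSampler(
    model, hyvid_embeds,
    512,   # width
    320,   # height
    85,    # num_frames
    30,    # steps
    6,     # embedded_guidance_scale
    9,     # flow_shift
    6,     # seed
    1,     # force_offload
    None,  # samples (input latents for vid2vid)
    1,     # denoise_strength
    None,  # stg_args
)
```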
- prompt = "high quality nature video of a red panda balancing on a bamboo stick while a bird lands on the panda's head, there's a waterfall in the background"
HunyuanVideo_00003.mp4
- prompt = "high quality anime-style video of a chibi cat with big sparkling eyes, wearing a magical hat, holding a wand, and surrounded by glowing magical orbs, in a lush enchanted forest with floating cherry blossoms and a sparkling stream in the background"
HunyuanVideo_00004.mp4
To run the image-to-video "custom" workflow below, download the additional model files:

```bash
huggingface-cli download --resume-download Kijai/HunyuanVideo_comfy hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors --local-dir ComfyUI/models/diffusion_models --local-dir-use-symlinks False
huggingface-cli download --resume-download Kijai/HunyuanVideo_comfy hunyuan_video_vae_bf16.safetensors --local-dir ComfyUI/models/vae --local-dir-use-symlinks False
huggingface-cli download --resume-download Kijai/HunyuanVideo_comfy hunyuan_video_custom_720p_fp8_scaled.safetensors --local-dir ComfyUI/models/diffusion_models --local-dir-use-symlinks False
cp HunyuanVideo_comfy/hunyuan_video_custom_720p_fp8_scaled.safetensors ComfyUI/models/diffusion_models

wget https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/clip_vision/llava_llama3_vision.safetensors
cp llava_llama3_vision.safetensors ComfyUI/models/clip_vision

wget https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/llava_llama3_fp8_scaled.safetensors
cp llava_llama3_fp8_scaled.safetensors ComfyUI/models/clip

wget https://huggingface.co/camenduru/FLUX.1-dev/resolve/main/clip_l.safetensors
cp clip_l.safetensors ComfyUI/models/clip
```

Install sageattention, optionally point Hugging Face downloads at a mirror, and launch ComfyUI:

```bash
pip install sageattention
export HF_ENDPOINT=https://hf-mirror.com
comfy launch -- --listen 0.0.0.0
```

Transpile the custom workflow:

```bash
python -m comfy_script.transpile hyvideo_custom_testing_01_edit.json
```
The transpiled workflow, lightly cleaned up:

```python
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

image_path = "爱可菲.webp"
prompt = 'Realistic, High-quality. A woman is boxing with a panda, and they are at a stalemate.'

with Workflow():
    image, _ = LoadImage(image_path)
    image, width, height = ImageResizeKJv2(image, 896, 512, 'lanczos', 'pad', '255,255,255', 'center', 16)
    PreviewImage(image)
    # _ = HyVideoTeaCache(0.1, 'offload_device', 0, -1)
    vae = HyVideoVAELoader('hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    torch_compile_args = HyVideoTorchCompileSettings('inductor', False, 'default', False, 64, True, True, False, False, False)
    block_swap_args = HyVideoBlockSwap(20, 0, False, False)
    model = HyVideoModelLoader('hunyuan_video_custom_720p_fp8_scaled.safetensors', 'bf16', 'fp8_scaled', 'offload_device', 'sageattn', torch_compile_args, block_swap_args, None, False, True)
    clip = DualCLIPLoader('clip_l.safetensors', 'llava_llama3_fp8_scaled.safetensors', 'hunyuan_video', 'default')
    clip_vision = CLIPVisionLoader('llava_llama3_vision.safetensors')
    clip_vision_output = CLIPVisionEncode(clip_vision, image, 'center')
    conditioning = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, prompt, 2)
    conditioning2 = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, 'Aerial view, aerial view, overexposed, low quality, deformation, a poor composition, bad hands, bad teeth, bad eyes, bad limbs, distortion, blurring, text, subtitles, static, picture, black border.', 2)
    hyvid_embeds = HyVideoTextEmbedBridge(conditioning, 7.5, 0, 1, False, True, conditioning2)
    samples = HyVideoEncode(vae, image, False, 64, 256, True, 0, 1, 'sample')
    samples = HyVideoSampler(model, hyvid_embeds, width, height, 85, 30, 0, 13.0, 2, True, None, samples, 1, None, None, None, None, 'FlowMatchDiscreteScheduler', 0, 'dynamic', None, None, None, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True, 0, False)
    images2 = ImageConcatMulti(2, images, image, 'left', False)
    _ = VHSVideoCombine(images2, 24, 0, 'HunyuanVideoCustom_wrapper', 'video/h264-mp4', False, False, None, None, None)
```
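Saved as, for example, run_i2v_custom.py (the filename is an assumption; any name works), the script runs the same way as the earlier examples:

```bash
python run_i2v_custom.py
```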
Export the action prompts used by the batch script below to a CSV file:

```python
from datasets import load_dataset

ds = load_dataset("svjack/daily-actions-locations-en-zh")
df = ds["train"].to_pandas()
df.to_csv("en_action.csv", index=False)
```
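A quick sanity check of the exported file (the en_action column is the one consumed by run_akf.py below):

```python
import pandas as pd

# Preview the first few action prompts.
df = pd.read_csv("en_action.csv")
print(df["en_action"].head())
```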
Create run_akf.py (e.g. `vim run_akf.py`):

```python
import os
import time
import subprocess
from itertools import product
from pathlib import Path

import pandas as pd

# Configuration
SEEDS = [42]
IMAGE_PATHS = ['npl.jpg']  # Using the new image path
OUTPUT_DIR = 'ComfyUI/temp'
CSV_PATH = 'en_action.csv'
PYTHON_PATH = '/environment/miniconda3/envs/system/bin/python'


def get_latest_output_count():
    """Return the number of MP4 files in the output directory."""
    try:
        return len(list(Path(OUTPUT_DIR).glob('*.mp4')))
    except Exception:
        return 0


def wait_for_new_output(initial_count):
    """Wait until a new MP4 file appears in the output directory."""
    timeout = 300  # timeout for video generation (5 minutes)
    start_time = time.time()
    while time.time() - start_time < timeout:
        current_count = get_latest_output_count()
        if current_count > initial_count:
            time.sleep(1)  # additional 1-second delay so the file is fully written
            return True
        time.sleep(0.5)
    return False


def generate_script(image_path, seed, action):
    """Generate the HunyuanVideo ComfyScript with the given parameters."""
    # Note: seed is accepted but not interpolated below; the sampler call
    # keeps the values from the transpiled workflow.
    prompt = f'Realistic, High-quality. the man {action}'
    script_content = f"""from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

image_path = "{image_path}"
prompt = '{prompt}'

with Workflow():
    image, _ = LoadImage(image_path)
    image, width, height = ImageResizeKJv2(image, 896, 512, 'lanczos', 'pad', '255,255,255', 'center', 16)
    PreviewImage(image)
    # _ = HyVideoTeaCache(0.1, 'offload_device', 0, -1)
    vae = HyVideoVAELoader('hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    torch_compile_args = HyVideoTorchCompileSettings('inductor', False, 'default', False, 64, True, True, False, False, False)
    block_swap_args = HyVideoBlockSwap(20, 0, False, False)
    model = HyVideoModelLoader('hunyuan_video_custom_720p_fp8_scaled.safetensors', 'bf16', 'fp8_scaled', 'offload_device', 'sageattn', torch_compile_args, block_swap_args, None, False, True)
    clip = DualCLIPLoader('clip_l.safetensors', 'llava_llama3_fp8_scaled.safetensors', 'hunyuan_video', 'default')
    clip_vision = CLIPVisionLoader('llava_llama3_vision.safetensors')
    clip_vision_output = CLIPVisionEncode(clip_vision, image, 'center')
    conditioning = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, prompt, 2)
    conditioning2 = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, 'Aerial view, aerial view, overexposed, low quality, deformation, a poor composition, bad hands, bad teeth, bad eyes, bad limbs, distortion, blurring, text, subtitles, static, picture, black border.', 2)
    hyvid_embeds = HyVideoTextEmbedBridge(conditioning, 7.5, 0, 1, False, True, conditioning2)
    samples = HyVideoEncode(vae, image, False, 64, 256, True, 0, 1, 'sample')
    samples = HyVideoSampler(model, hyvid_embeds, width, height, 85, 30, 0, 13.0, 2, True, None, samples, 1, None, None, None, None, 'FlowMatchDiscreteScheduler', 0, 'dynamic', None, None, None, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True, 0, False)
    images2 = ImageConcatMulti(2, images, image, 'left', False)
    _ = VHSVideoCombine(images2, 24, 0, 'HunyuanVideoCustom_wrapper', 'video/h264-mp4', False, False, None, None, None)
"""
    return script_content


def main():
    # Load actions from CSV
    try:
        actions = pd.read_csv(CSV_PATH)["en_action"].tolist()
    except Exception as e:
        print(f"Error loading CSV file: {e}")
        return

    # Ensure output directory exists
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    # Generate all combinations of seeds and image paths
    seed_image_combinations = list(product(SEEDS, IMAGE_PATHS))

    # Main generation loop
    for action in actions:
        for seed, image_path in seed_image_combinations:
            # Generate script
            script = generate_script(image_path, seed, action)

            # Write script to file
            with open('run_hunyuan_video.py', 'w') as f:
                f.write(script)

            # Get current output count before running
            initial_count = get_latest_output_count()

            # Run the script
            print(f"Generating video with action: {action}, seed: {seed}, image: {image_path}")
            subprocess.run([PYTHON_PATH, 'run_hunyuan_video.py'])

            # Wait for new output
            if not wait_for_new_output(initial_count):
                print("Timeout waiting for new output. Continuing to next generation.")
                continue


if __name__ == "__main__":
    main()
```

HunyuanVideoCustom_wrapper_00001.mp4
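Run the batch script to generate one clip per action (the sample clip above was produced by this loop):

```bash
python run_akf.py
```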
Download the image/caption dataset used by the next batch script:

```bash
git clone https://huggingface.co/datasets/svjack/Xiang_InfiniteYou_Handsome_Pics_Captioned
```
Create a similar batch script driven by the dataset's images and captions:

```python
import os
import time
import subprocess
from pathlib import Path

from datasets import load_dataset
from PIL import Image

# Configuration
SEEDS = [42]
OUTPUT_DIR = 'ComfyUI/temp'
INPUT_DIR = 'ComfyUI/input'
PYTHON_PATH = '/environment/miniconda3/bin/python'


def get_latest_output_count():
    """Return the number of MP4 files in the output directory."""
    try:
        return len(list(Path(OUTPUT_DIR).glob('*.mp4')))
    except Exception:
        return 0


def wait_for_new_output(initial_count):
    """Wait until a new MP4 file appears in the output directory."""
    timeout = 3000  # generous timeout for video generation (50 minutes)
    start_time = time.time()
    while time.time() - start_time < timeout:
        current_count = get_latest_output_count()
        if current_count > initial_count:
            time.sleep(1)  # additional 1-second delay so the file is fully written
            return True
        time.sleep(0.5)
    return False


def generate_script(image_path, seed, prompt):
    """Generate the HunyuanVideo ComfyScript with the given parameters."""
    script_content = f"""from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

image_path = "{image_path}"
prompt = '{prompt}'

with Workflow():
    image, _ = LoadImage(image_path)
    image, width, height = ImageResizeKJv2(image, 896, 512, 'lanczos', 'pad', '255,255,255', 'center', 16)
    PreviewImage(image)
    # _ = HyVideoTeaCache(0.1, 'offload_device', 0, -1)
    vae = HyVideoVAELoader('hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    torch_compile_args = HyVideoTorchCompileSettings('inductor', False, 'default', False, 64, True, True, False, False, False)
    block_swap_args = HyVideoBlockSwap(20, 0, False, False)
    model = HyVideoModelLoader('hunyuan_video_custom_720p_fp8_scaled.safetensors', 'bf16', 'fp8_scaled', 'offload_device', 'sageattn', torch_compile_args, block_swap_args, None, False, True)
    clip = DualCLIPLoader('clip_l.safetensors', 'llava_llama3_fp8_scaled.safetensors', 'hunyuan_video', 'default')
    clip_vision = CLIPVisionLoader('llava_llama3_vision.safetensors')
    clip_vision_output = CLIPVisionEncode(clip_vision, image, 'center')
    conditioning = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, prompt, 2)
    conditioning2 = TextEncodeHunyuanVideoImageToVideo(clip, clip_vision_output, 'Aerial view, aerial view, overexposed, low quality, deformation, a poor composition, bad hands, bad teeth, bad eyes, bad limbs, distortion, blurring, text, subtitles, static, picture, black border.', 2)
    hyvid_embeds = HyVideoTextEmbedBridge(conditioning, 7.5, 0, 1, False, True, conditioning2)
    samples = HyVideoEncode(vae, image, False, 64, 256, True, 0, 1, 'sample')
    samples = HyVideoSampler(model, hyvid_embeds, width, height, 85, 30, 0, 13.0, 2, True, None, samples, 1, None, None, None, None, 'FlowMatchDiscreteScheduler', 0, 'dynamic', None, None, None, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True, 0, False)
    images2 = ImageConcatMulti(2, images, image, 'left', False)
    _ = VHSVideoCombine(images2, 24, 0, 'HunyuanVideoCustom_wrapper', 'video/h264-mp4', False, False, None, None, None)
"""
    return script_content


def save_image(image_data, index):
    """Save an image from the dataset to a file with a zero-padded index."""
    os.makedirs(INPUT_DIR, exist_ok=True)
    filename = f"{index:04d}.png"  # 4-digit zero-padded number
    filepath = os.path.join(INPUT_DIR, filename)
    if isinstance(image_data, Image.Image):
        image_data.save(filepath)
    else:
        # Handle case where image_data might be a dictionary or array
        Image.fromarray(image_data).save(filepath)
    return filepath


def main():
    # Load dataset (from the locally cloned folder)
    try:
        ds = load_dataset("Xiang_InfiniteYou_Handsome_Pics_Captioned", split="train")
    except Exception as e:
        print(f"Error loading dataset: {e}")
        return

    # Ensure output directory exists
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    # Main generation loop
    for i in range(len(ds)):
        # Get image and caption (strip quotes so the prompt embeds safely in the script)
        image_data = ds[i]["image"]
        prompt = ds[i]["joy-caption"].replace("'", "").replace('"', '')

        # Save image and keep only the filename (LoadImage resolves it against ComfyUI/input)
        image_path = save_image(image_data, i)
        image_path = image_path.split("/")[-1]

        for seed in SEEDS:
            # Generate script
            script = generate_script(image_path, seed, prompt)

            # Write script to file
            with open('run_hunyuan_video.py', 'w') as f:
                f.write(script)

            # Get current output count before running
            initial_count = get_latest_output_count()

            # Run the script
            print(f"Generating video for sample {i} with prompt: {prompt}, seed: {seed}")
            subprocess.run([PYTHON_PATH, 'run_hunyuan_video.py'])

            # Wait for new output
            if not wait_for_new_output(initial_count):
                print("Timeout waiting for new output. Continuing to next generation.")
                continue


if __name__ == "__main__":
    main()
```

This repository extends the functionality of ComfyUI-HunyuanVideoWrapper by adding support for LoRA models, enabling the generation of high-quality video content with custom character and action LoRA models.
Open source is truly amazing, and HunyuanVideo now supports LoRA models! I recently tested HunyuanVideo with both action LoRA and character LoRA, and the results are fantastic.
- Repository: ComfyUI-HunyuanVideoWrapper
- Workflow Example: HunyuanVideo LoRA Workflow
Download the LoRA Model:

Download the LoRA model from CivitAI: kxsr_walking_anim_v1-5.safetensors

Copy the model to the loras directory:

```bash
cp kxsr_walking_anim_v1-5.safetensors ComfyUI/models/loras
```
Install Workflow Dependencies:

```bash
comfy node install-deps --workflow='hunyuanvideo lora Walking Animation Share.json'
```
Transpile the Workflow:

```bash
python -m comfy_script.transpile 'hunyuanvideo lora Walking Animation Share.json'
```
Run the Workflow:

Create a Python script run_t2v_walking_lora.py:

```python
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

with Workflow():
    vae = HyVideoVAELoader(r'hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    lora = HyVideoLoraSelect('kxsr_walking_anim_v1-5.safetensors', 1, None, None)
    model = HyVideoModelLoader(r'hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors', 'bf16', 'fp8_e4m3fn', 'offload_device', 'sdpa', None, None, lora)
    hyvid_text_encoder = DownloadAndLoadHyVideoTextEncoder('Kijai/llava-llama-3-8b-text-encoder-tokenizer', 'openai/clip-vit-large-patch14', 'fp16', False, 2, 'disabled')
    hyvid_embeds = HyVideoTextEncode(hyvid_text_encoder, "kxsr, Shrek, full body, no_crop", 'bad quality video', 'video', None, None, None)
    samples = HyVideoSampler(model, hyvid_embeds, 512, 320, 85, 30, 6, 9, 6, 1, None, 1, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True)
    _ = VHSVideoCombine(images, 24, 0, 'HunyuanVideo', 'video/h264-mp4', False, True, None, None, None, pix_fmt='yuv420p', crf=19, save_metadata=True, trim_to_audio=False)
```

Run the script:

```bash
python run_t2v_walking_lora.py
```
- prompt = "kxsr, Shrek, full body, no_crop"
HunyuanVideo_00005.mp4
- Action LoRA Demo: ComfyOnline - Action LoRA
- Character LoRA Demo: ComfyOnline - Character LoRA
Download the Makima LoRA Model:

Download the Makima LoRA model from CivitAI: makima_hunyuan.safetensors

Copy the model to the loras directory:

```bash
cp makima_hunyuan.safetensors ComfyUI/models/loras
```
Install Workflow Dependencies:

```bash
comfy node install-deps --workflow='hunyuan video lora makima character.json'
```
Transpile the Workflow:

```bash
python -m comfy_script.transpile 'hunyuan video lora makima character.json'
```
Run the Workflow:

Create a Python script run_t2v_makima_lora.py:

```python
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

with Workflow():
    vae = HyVideoVAELoader(r'hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    lora = HyVideoLoraSelect('makima_hunyuan.safetensors', 1, None, None)
    model = HyVideoModelLoader(r'hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors', 'bf16', 'fp8_e4m3fn', 'offload_device', 'sdpa', None, None, lora)
    hyvid_text_encoder = DownloadAndLoadHyVideoTextEncoder('Kijai/llava-llama-3-8b-text-encoder-tokenizer', 'openai/clip-vit-large-patch14', 'fp16', False, 2, 'disabled')
    hyvid_embeds = HyVideoTextEncode(hyvid_text_encoder, "kxsr, 1 lively kxsr running on campus, cinematic, anime aesthetic", 'bad quality video', 'video', None, None, None)
    samples = HyVideoSampler(model, hyvid_embeds, 512, 320, 85, 30, 6, 9, 6, 1, None, 1, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True)
    _ = VHSVideoCombine(images, 24, 0, 'HunyuanVideo', 'video/h264-mp4', False, True, None, None, None, pix_fmt='yuv420p', crf=19, save_metadata=True, trim_to_audio=False)
```

Run the script:

```bash
python run_t2v_makima_lora.py
```
- prompt = "kxsr, 1 lively kxsr running on campus, cinematic, anime aesthetic"
HunyuanVideo_00006.mp4
Download the Xiangling LoRA Model:

Download the Xiangling LoRA model from Hugging Face: xiangling_test_epoch4.safetensors

Copy the model to the loras directory:

```bash
cp xiangling_test_epoch4.safetensors ComfyUI/models/loras
```
Run the Workflow:

Create a Python script run_t2v_xiangling_lora.py:

```python
# character do something (seed 42)
from comfy_script.runtime import *
load()
from comfy_script.runtime.nodes import *

with Workflow():
    vae = HyVideoVAELoader(r'hunyuan_video_vae_bf16.safetensors', 'bf16', None)
    lora = HyVideoLoraSelect('xiangling_test_epoch4.safetensors', 2.0, None, None)
    model = HyVideoModelLoader(r'hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors', 'bf16', 'fp8_e4m3fn', 'offload_device', 'sdpa', None, None, lora)
    hyvid_text_encoder = DownloadAndLoadHyVideoTextEncoder('Kijai/llava-llama-3-8b-text-encoder-tokenizer', 'openai/clip-vit-large-patch14', 'fp16', False, 2, 'disabled')
    hyvid_embeds = HyVideoTextEncode(hyvid_text_encoder, "solo,Xiangling, cook rice in a pot genshin impact ,1girl,highres,", 'bad quality video', 'video', None, None, None)
    samples = HyVideoSampler(model, hyvid_embeds, 478, 512, 85, 30, 6, 9, 42, 1, None, 1, None)
    images = HyVideoDecode(vae, samples, True, 64, 256, True)
    #_ = VHSVideoCombine(images, 24, 0, 'HunyuanVideo', 'video/h264-mp4', False, True, None, None, None)
    _ = VHSVideoCombine(images, 24, 0, 'HunyuanVideo', 'video/h264-mp4', False, True, None, None, None, pix_fmt='yuv420p', crf=19, save_metadata=True, trim_to_audio=False)
```

Run the script:

```bash
python run_t2v_xiangling_lora.py
```
- prompt = "solo,Xiangling, cook rice in a pot genshin impact ,1girl,highres,"
HunyuanVideo_00029_epoch4_seed42.mp4
Scaled dot product attention (sdpa) should now be working (only tested on Windows, with torch 2.5.1+cu124 on a 4090). sageattention is still recommended for speed, but it is no longer required, which makes installation much easier.
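As the scripts above show, the attention backend is selected when loading the model. A minimal sketch based on the calls used in this README (positional arguments exactly as in the transpiled workflows):

```python
# 'sdpa' needs no extra dependencies; 'sageattn' requires `pip install sageattention`.
model = HyVideoModelLoader(
    r'hunyuan_video_720_cfgdistill_fp8_e4m3fn.safetensors',
    'bf16', 'fp8_e4m3fn', 'offload_device',
    'sdpa',            # or 'sageattn' for faster sampling
    None, None, None)
```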
Vid2vid test: source video
chrome_O4wUtaOQhJ.mp4
text2vid (old test):
chrome_SLgFRaGXGV.mp4
Transformer and VAE (single files, no autodownload):
https://huggingface.co/Kijai/HunyuanVideo_comfy/tree/main
These go to the usual ComfyUI folders (diffusion_models and vae).
LLM text encoder (has autodownload):
https://huggingface.co/Kijai/llava-llama-3-8b-text-encoder-tokenizer
Files go to ComfyUI/models/LLM/llava-llama-3-8b-text-encoder-tokenizer
Clip text encoder (has autodownload):
Either use any CLIP-L model supported by ComfyUI by disabling the clip_model in the text encoder loader and plugging a ClipLoader into the text encoder node, or allow the autodownloader to fetch the original clip model from:
https://huggingface.co/openai/clip-vit-large-patch14 (you only need the .safetensors weights file, plus all the config files) into:
ComfyUI/models/clip/clip-vit-large-patch14
Memory use is entirely dependent on resolution and frame count; don't expect to be able to go very high even on 24GB.
Good news is that the model can do functional videos even at really low resolutions.
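If you hit out-of-memory errors, the first knobs to turn are the width, height, and frame-count arguments of HyVideoSampler. A hedged example based on the run_t2v.py call above (the reduced frame count is illustrative, not tuned):

```python
# Same call as in run_t2v.py, but with fewer frames (85 -> 45) to reduce VRAM use.
samples = HyVideoSampler(model, hyvid_embeds, 512, 320, 45, 30, 6, 9, 6, 1, None, 1, None)
```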