How to run LTX Video 13B 0.9.7 in ComfyUI?

May 13, 2025
Explore LTX Video 13B model, the free, open-source video generator that offers high-quality, fast video rendering on consumer-grade GPUs. Learn how to use it!

1. Introduction

LTX Video 13B is a free, open-source video generation model that delivers impressive quality and speed, even on consumer-grade GPUs (e.g., NVIDIA RTX 4090 or 5090). Version 0.9.7 introduces enhanced capabilities that make it easier to generate videos from images. In this post, we focus on its image-to-video features, including the ability to animate between a start and end frame, a key upgrade that opens up new creative possibilities. We'll walk you through the model's features, how to install and run it inside ComfyUI, and how to get the most out of its image-to-video workflows. Whether you're a content creator, hobbyist, or simply curious about video AI, this guide will help you get up and running.

2. Key Features of LTX Video

LTX Video distinguishes itself in the competitive landscape of video generation tools with a rich set of powerful and user-friendly features. Some of its standout capabilities include:

  • Multiscale Video Rendering
    LTX Video generates videos in progressive layers, starting with a coarse layout and gradually refining details. This results in high-quality, coherent outputs.

  • Full HD Output & Extended Duration
    Users can generate videos in Full HD (1920×1080) and create clips that go beyond just a few seconds, making it suitable for more ambitious or narrative-driven projects.

  • Keyframe Control with Reference Images
    Incorporate start and end frames or insert key reference images throughout the timeline to guide the animation. This gives users greater creative control and consistency across frames.

  • Integrated Free Upscalers
    LTX Video includes built-in upscaling tools that improve the resolution and visual fidelity of the output—ideal for creators aiming for professional-quality results.

These features make LTX Video a flexible and powerful solution for hobbyists, artists, and professionals alike, offering creative freedom without sacrificing quality or accessibility.

3. How to install LTXV 13B in ComfyUI

Before running LTX Video 13B in ComfyUI, you'll need to set up a few prerequisites. Here's a step-by-step guide to getting everything in place:

  1. Download the FP8 Quantized Model

    • Model name: ltxv-13b-0.9.7-dev_fp8_e4m3fn.safetensors

    • Download link: Hugging Face - Kijai/LTXV

    • Place it in the following folder: ComfyUI/models/checkpoints/

  2. Download the Required CLIP Model

  3. Choose a Workflow File (JSON)

    • Visit the GitHub page: ComfyUI-LTXVideo

    • Available workflows:

      • Simplified image to video

      • Simplified image to video with extension

      • Simplified image to video with keyframes

    • Download the one that best suits your needs.

  4. Update ComfyUI

    • Open the ComfyUI Manager

    • Click "Update ComfyUI"

    • Restart ComfyUI once the update completes

  5. Load the Workflow

    • Drag and drop the downloaded workflow .json file directly into the ComfyUI interface

  6. Install Required Nodes

    • Go back into the ComfyUI Manager

    • Open "Custom Node Manager"

    • Search for ltx

    • Install the node called ComfyUI-LTXVideo

    • Restart ComfyUI again to ensure everything loads correctly

Once all steps are completed, you'll be ready to start creating videos using LTX Video 13B inside ComfyUI.
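
If the model does not show up in ComfyUI, a quick file check can rule out a misplaced download. The snippet below is a minimal sketch, not part of the official setup: the helper name `find_checkpoint` is made up for illustration, and it assumes your ComfyUI folder is at `./ComfyUI`. The filename and target folder come from step 1 above.

```python
from pathlib import Path

# Filename taken from step 1 of this guide.
CHECKPOINT_NAME = "ltxv-13b-0.9.7-dev_fp8_e4m3fn.safetensors"


def find_checkpoint(comfy_root, name=CHECKPOINT_NAME):
    """Return the checkpoint path if it exists under models/checkpoints, else None."""
    path = Path(comfy_root) / "models" / "checkpoints" / name
    return path if path.is_file() else None


if __name__ == "__main__":
    # Assumption: ComfyUI lives in ./ComfyUI; adjust to your install location.
    found = find_checkpoint("ComfyUI")
    if found:
        size_gb = found.stat().st_size / 1e9
        print(f"Found {found.name} ({size_gb:.1f} GB)")
    else:
        print("Checkpoint not found in ComfyUI/models/checkpoints/")
```

A very small reported file size usually points to an interrupted download, in which case it is worth fetching the file again.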

In the next part, we’ll dive into the “Simplified Image to Video” workflow—a great starting point that shows how to generate high-quality videos from a single image using LTXV’s default settings and structure.

4. Image-to-Video Workflow with LTX Video 13B in ComfyUI

Now that everything is set up, it's time to start using the Simplified Image-to-Video Workflow. Begin by dragging the workflow file into ComfyUI. Once it's loaded, select the models we just downloaded: the FP8 quantized LTX Video 13B model and the CLIP model. After that, load an initial image and fill in a positive prompt to finish the first part of the workflow. See the image below:

Positive prompt: The camera remains still as the woman, dressed in a sensual outfit, stands or reclines on a floating cloud. Her long hair sways gently in the wind, and her body shifts subtly with each breath, creating a soft bounce in her chest. She keeps her eyes locked on the camera, exuding serene sensuality as the clouds drift around her, with soft light highlighting her figure.

Next, we’ll configure the resolution in the LTX Base Sampler Node. LTX Video 13B only generates at two fixed base resolutions: 768×512 (landscape) or 512×768 (portrait). When you load an initial image, it will automatically be resized to match one of these formats. You can’t use arbitrary resolutions here—this is the low-resolution base generation stage.

Choose:

  • 768×512 if your image is wider (e.g., 16:9 ratio)

  • 512×768 if your image is taller (portrait-oriented)

Since I’m using an image with a 16:9 aspect ratio, I’ll stick with 768×512.
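
The same choice can be written as a tiny helper, which is handy if you prepare many input images. This is only a sketch of the rule described above: the function name `choose_base_resolution` is hypothetical, and treating square images as landscape is my own assumption, since the workflow only distinguishes wider from taller inputs.

```python
def choose_base_resolution(width, height):
    """Pick 768x512 for landscape inputs and 512x768 for portrait inputs.

    Assumption: square images are treated as landscape.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return (768, 512) if width >= height else (512, 768)


# A 16:9 source image maps to the landscape base resolution.
print(choose_base_resolution(1920, 1080))  # (768, 512)
# A 9:16 source image maps to the portrait base resolution.
print(choose_base_resolution(1080, 1920))  # (512, 768)
```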

You don't need to change num_frames, which is already set to 97. At the workflow's 24 fps, this gives about 4 seconds of video.
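
If you do want to change the clip length, the arithmetic is simply frames divided by frame rate. Here is a short sketch; `clip_duration_seconds` is an illustrative name, not a ComfyUI function.

```python
def clip_duration_seconds(num_frames, fps=24):
    """Return the clip length in seconds for a given frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Default workflow settings: 97 frames at 24 fps.
print(f"{clip_duration_seconds(97):.2f} s")  # 4.04 s
```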

Finally, to save the output video, make sure the save_output toggle in the Video Combine node is set to True. This ensures your rendered result is saved in the ComfyUI/output directory.

Since we haven’t installed the upscaler models yet, we’ll need to bypass the last two sections of the workflow: the Latent Upscaler group and the Add Details group. To do this, simply right-click on each group node and select “Bypass Group Nodes” from the menu. This will temporarily skip those parts of the workflow during generation. See below how the disabled group nodes should look:

Once that’s done, you’re ready to generate a non-upscaled version of your video. Just click Generate, and LTX Video will process your initial image into a 4-second clip based on the current settings. Below are some examples of the low resolution base generations.

5. How to Enable LTXV 13B Upscaling in ComfyUI (Spatial + Temporal)

Now that we've generated a basic video, you might want to create videos with higher quality. To do that, we’ll install the upscalers used in the last two sections of the workflow: Spatial and Temporal upscaling.

You can download the models from the links below:

Once downloaded, place both files into the following directory:
ComfyUI/models/upscale_models
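
To confirm that both files landed in the right place before refreshing ComfyUI, you can list the contents of that folder. This is a minimal sketch that assumes ComfyUI sits at `./ComfyUI`; the helper name `list_upscale_models` is made up for this example, and it lists whatever files are present rather than expecting specific filenames.

```python
from pathlib import Path


def list_upscale_models(comfy_root):
    """Return the sorted names of all files in models/upscale_models (empty list if missing)."""
    folder = Path(comfy_root) / "models" / "upscale_models"
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    # Assumption: ComfyUI lives in ./ComfyUI; adjust to your install location.
    for name in list_upscale_models("ComfyUI"):
        print(name)
```

You should see two entries, one for the spatial upscaler and one for the temporal upscaler.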

After copying the models, return to ComfyUI, click “Edit” in the top menu, and select “Refresh Node Definitions” (or simply press R) to reload the available models inside the workflow.

Next, re-enable the Latent Upscaler and Add Details groups that you previously bypassed. Right-click each group and select “Set Group Nodes to Always” to activate them again.

Before hitting Generate, make sure the correct upscaling models are toggled on: spatial, temporal, or both, depending on your desired output. These upscalers will refine your initial low-resolution base generation into a higher quality final result. See the image below for reference.

6. LTX-Video for ComfyUI: Final Thoughts, Strengths, and Limitations

LTX-Video in ComfyUI is a fast and efficient tool for generating AI-powered video from still images using latent video diffusion. With optional spatial and temporal upscaling, it can transform low-resolution outputs (768x512 or 512x768) into higher-resolution formats like 1280x720 or 1920x1080. This makes it highly practical for creators seeking a solid blend of speed and visual quality.

Once the LTXV upscalers are installed and activated, the initial low-res video can be upscaled to significantly better quality. This workflow is efficient, modular, and friendly to systems with limited VRAM thanks to tiling support.

✅ Strengths

  • Excellent for portrait shots, stylized animations, and smooth, low-motion sequences.

  • Fast generation speeds with customizable output settings.

  • Simple to use within ComfyUI’s flexible, node-based workflow.

  • A good solution for quick prototyping, aesthetic reels, or content that prioritizes style over realism.

⚠️ Limitations & Considerations

  • Fixed resolution generation: LTX always resizes your input image to either 768x512 or 512x768 before generation, so precise resolution control must happen during upscaling.

  • Prompt adherence is weak, especially across longer sequences—complex text prompts may not translate accurately into consistent visual output.

  • Anatomical inaccuracies: Faces, limbs, and especially eyes often degrade with larger movements. Fast motion can lead to warping, flickering, or unstable anatomy across frames.

  • Temporal coherence is limited: High-motion clips suffer from inconsistencies in lighting, structure, and detail retention.

Overall Summary

LTX-Video is a strong option for artists and creators who need fast AI video generation with a balance of quality and efficiency. It performs best in scenarios with limited motion, such as character loops, slow zooms, or atmospheric visuals. However, it’s not yet ready for anatomically demanding content or high-fidelity character animation.
