Run ComfyUI with Sage Attention on RunPod (Network Volume Setup)

September 17, 2025
Learn to run ComfyUI with Sage Attention on RunPod. This guide covers Network Volume setup, pod deployment, workflow execution, and model downloads.

1. Introduction: Running ComfyUI with Sage Attention

In the world of AI and machine learning, efficiency and speed are paramount. This tutorial will guide you through the process of setting up ComfyUI with Sage Attention on RunPod, a platform that allows you to leverage high-performance GPUs without the need for expensive hardware. Sage Attention 2.2 enhances the attention mechanism in ComfyUI, significantly reducing rendering times, especially for complex workflows. By utilizing RunPod, you can create a flexible and scalable environment that supports your AI projects.

This guide will cover everything from creating a RunPod account to deploying your first workflow, ensuring that you have a seamless experience. With the added benefit of a Network Volume, your setup will retain all necessary files and configurations, allowing for persistent storage across sessions. Let’s dive into the steps required to get your ComfyUI environment up and running!

2. Create and Fund Your RunPod Account

To begin your journey with ComfyUI on RunPod, the first step is to create an account. Visit the RunPod website and sign up using your preferred credentials.

After setting up your account, add funds to start deploying GPU pods. A minimum of $10 is required, which you can quickly do via the Billing section in your RunPod dashboard.

💡 This funding covers your GPU usage and Network Volume storage.

With your account funded, it’s time to set up a Network Volume — the persistent storage that will retain your ComfyUI setup, models, and workflows. A properly configured volume ensures that all your work remains safe and ready to use each time you launch a pod.

3. Creating a Network Volume and Choosing a Region

Workflows in ComfyUI with Sage Attention often use large model files and custom node configurations, which can be cumbersome to download or set up repeatedly. This is where RunPod’s Network Volumes make a difference.

A Network Volume is a form of persistent cloud storage linked to your GPU pods. It keeps your ComfyUI installation, models, extensions, and custom workflows safe, even after your pod stops or is terminated. Without a Network Volume, all your data would be lost when a pod shuts down, forcing you to redo the setup from scratch.

How to Create a Network Volume on RunPod

  1. Access the “Storage” section
    Log in to your RunPod dashboard and click Storage in the left sidebar. This is where all network volumes are managed.

  2. Create a new volume
    Click the New Network Volume button (top left of the page) to open the creation form.

  3. Choose a region and name your volume
    Select a datacenter close to your GPU pods to minimize latency. The page will show which GPUs are available in each region to help you choose. Then, give your volume a descriptive name, like “ComfyUI SageAttention”, to stay organized.

  4. Specify the storage size
    Decide on the amount of storage you’ll need. 50 GB is a good starting point for your initial ComfyUI setup, models, and workflows, and you can expand the volume later as your collection grows.

  5. Review and create
    After entering all details, click Create Network Volume to finalize your persistent storage setup.

💡 Pro Tip: Check which region consistently offers the GPUs you’ll use before creating your volume — this prevents moving large files later.

🔔 Important Notes on Network Volumes

  • Pricing
    Network Volumes cost $0.07 per GB per month, which comes to roughly $7/month for 100 GB.

  • Persistence
    Your volume keeps all ComfyUI files, models, and workflows safe — even if a pod is stopped or deleted — saving you from having to re-download or reconfigure anything.

  • Region Lock-In
    Volumes are specific to their region. If you switch GPU regions later (e.g., EU-RO → US-CA), you’ll need to manually transfer your data to a new volume in that region.

  • Volume Size
    You can increase your volume size later, but cannot decrease it. It’s best to start with a modest size (50–100 GB) and scale up as needed.


We’ll be using the EU-RO-1 region for this guide, as it currently provides reliable access to the RTX 4090 (24 GB VRAM), ideal for running ComfyUI workflows. Once your Network Volume is ready, the next step is deploying your first pod with the Next Diffusion – ComfyUI SageAttention template.
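As a quick sanity check before you commit to a size, storage cost scales linearly with volume size. A minimal sketch, assuming RunPod's listed rate of $0.07 per GB per month (accurate at the time of writing, subject to change):

```bash
# Estimate monthly Network Volume cost: size in GB × $0.07/GB/month
# (rate assumed from RunPod's pricing page at the time of writing)
size_gb=50
awk -v gb="$size_gb" 'BEGIN { printf "$%.2f/month for %d GB\n", gb * 0.07, gb }'
```

So the 50 GB starting point suggested above works out to about $3.50/month.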

4. Deploying the ComfyUI with SageAttention Pod

With your Network Volume ready, the next step is to deploy a pod using the Next Diffusion – ComfyUI SageAttention template. This custom template automatically installs ComfyUI, ComfyUI Manager, and Sage Attention directly onto your Network Volume, keeping everything organized and persistent across sessions.

Accessing Your Network Volume

Now that your Network Volume is created, let’s put it to use:

  1. Navigate to the Storage section in the left sidebar.

  2. Find your volume (e.g., ComfyUI SageAttention – 50GB) and click Deploy Pod with Volume.

    This will redirect you to the Deploy a Pod page, where your volume is already selected and ready in the Secure Cloud section.

  3. On this page, there’s one crucial step: expand the additional filters and ensure CUDA 12.8 is selected. The Next Diffusion – ComfyUI SageAttention template requires CUDA 12.8 for proper installation and GPU acceleration.

GPU Selection

Next, choose the GPU you want for your pod. For this guide, we’ll start off with the RTX 4090, which is ideal for running heavy ComfyUI workflows.

Tip: For the initial setup, you can pick a cheaper GPU since the first launch mainly installs the template onto your Network Volume. This can save money if you don’t need full performance yet. But for this tutorial, we’re going all-in with the RTX 4090.

Selecting the Next Diffusion – ComfyUI SageAttention Template

Now it’s time to select the correct Docker template for your pod:

  1. Click Change Template.

  2. Search for and select the template named Next Diffusion – ComfyUI SageAttention. (If it doesn’t appear in the search results, use the template’s direct link to select it automatically.)

This template ensures that ComfyUI, ComfyUI Manager, and Sage Attention are installed properly and ready for reuse on your persistent Network Volume.

Final Settings

  • GPU Count: 1

  • Pricing Type: On-Demand

💰 At the time of writing, the RTX 4090 costs around $0.59/hour, which is very reasonable for high-end GPU performance.
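To budget a session, multiply the hourly rate by your expected runtime. A quick sketch using the rate quoted above (an assumption that will drift as RunPod's pricing changes):

```bash
# Rough cost of a 3-hour session at $0.59/hour
# (rate quoted above; subject to change)
hours=3
rate=0.59
awk -v h="$hours" -v r="$rate" 'BEGIN { printf "$%.2f for %d hours\n", h * r, h }'
```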

Launch Your Pod

Scroll down and click Deploy On-Demand to start your pod. You’ll be redirected to the My Pods section, where your GPU instance will begin spinning up with the Next Diffusion – ComfyUI SageAttention template.

5. Pod Initialization & Logs

Once your pod is deployed, head to the Pods section in the sidebar — you should already be redirected there after clicking Deploy On-Demand in the previous step.

Checking Initialization Logs

When you start the pod, a right-hand panel will automatically slide out. From there, click on Logs. Inside the logs, you’ll find two tabs:

  • System Logs – shows the progress of pulling the Docker image

  • Container Logs – shows runtime messages from the container

After the Docker image is fully pulled, switch to the Container Logs tab. Here you’ll see a message from Next Diffusion indicating that initialization is in progress. During this phase, the Network Volume is being set up, and ComfyUI Manager, Sage Attention, and all necessary packages are being cloned and installed.

Note: This can take ~15 minutes, depending on your pod’s speed. Once everything is downloaded and installed, ComfyUI will automatically start on port 8188, so you can access the interface without opening VS Code.

After this setup period, you’ll see logs indicating that ComfyUI has started successfully on port 8188.

From here, we’ll move on to starting ComfyUI in the browser, connecting to the interface, and loading a workflow. Once your workflow is loaded, we’ll show how to download models and manage files using the VS Code environment on port 8888, ensuring your Network Volume is fully populated and ready for production use.

Let’s begin by showing how to start ComfyUI and access the interface on port 8188.

6. Starting ComfyUI and Loading a Workflow

Once your pod has finished initializing, ComfyUI is already running on port 8188, so you don’t need to open VS Code just to start it.

Accessing ComfyUI

  1. Go to the Pods section in your RunPod dashboard.

  2. Expand your active pod by clicking the arrow or panel toggle.

  3. Click Connect on your pod.

  4. Select HTTP Service → :8188 to open the ComfyUI web interface.

After opening port 8188, you’ll see the ComfyUI canvas, with ComfyUI Manager preloaded and ready to use.

Selecting a Workflow

  1. Click the ComfyUI icon in the top-left corner.

  2. Click on Browse Templates.

  3. From the sidebar, select Flux, then choose Flux Dev fp8 workflow.

Note: If you try to open/run the workflow, you’ll get a message that the required model is missing.

Don’t worry — in the next section, we’ll show how to download the correct models and manage your file structure using VS Code on port 8888, ensuring the workflow can run properly and all files persist on your Network Volume.

7. Downloading Models via VS Code (Port 8888)

When you try running the workflow loaded in ComfyUI, you may see an error — the required Flux Dev FP8 model isn’t yet downloaded. Let’s fix that using VS Code on port 8888.

Access VS Code Server

  1. Go back to the Pods section in your RunPod dashboard.

  2. Click Connect on your active pod.

  3. Select HTTP Service → :8888 to open the VS Code Server.

💡 Tip: This environment lets you manage your file structure and ensure models are saved to your Network Volume, so they persist for future pods.

Open the Integrated Terminal

  1. After VS Code opens, click the terminal icon in the top-right corner (or press Ctrl + J) to open the integrated terminal panel.

  2. Make sure the terminal is pointing to your workspace root directory.

Download the Flux Dev FP8 Model

Navigate to the checkpoints folder using the terminal:

```bash
cd ComfyUI/models/checkpoints
```

💡 Note: The cd command stands for “change directory” — it moves your terminal into the folder where you want the model to be downloaded.

Alternatively, you can right-click the folder in VS Code and select “Open in Integrated Terminal” to open the terminal directly in that folder.
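If you’d rather not rely on the folder already existing, a defensive version of the same step looks like this (paths assumed to match the template’s default layout under your workspace root):

```bash
# Create the checkpoints folder if it's missing, move into it,
# and confirm the location before downloading anything
mkdir -p ComfyUI/models/checkpoints
cd ComfyUI/models/checkpoints
pwd   # should end in ComfyUI/models/checkpoints
```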

Enter the following command and hit enter to download the model:

```bash
wget https://huggingface.co/lllyasviel/flux1_dev/resolve/main/flux1-dev-fp8.safetensors
```


The model will download directly into the folder and be saved on your Network Volume, so it will persist for future sessions.

💡 Tip: You can use this method for any model, VAE, or text encoder — just copy the Hugging Face download link and ensure your terminal is in the correct folder.
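The same pattern generalizes to other model types. A hedged sketch, where the folder names follow ComfyUI’s default layout and the URL is a placeholder rather than a real file:

```bash
# Each model type has its own folder under ComfyUI/models
for dir in vae clip loras; do
  mkdir -p "ComfyUI/models/$dir"
done
ls ComfyUI/models

# Inside the right folder, download with wget; -c resumes a partial
# download instead of restarting from zero:
#   wget -c "https://huggingface.co/<repo>/resolve/main/<file>.safetensors"
```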

Refresh ComfyUI

After downloading your model, you need to refresh ComfyUI so it recognizes the new files:

  1. Go back to the ComfyUI web interface (port 8188).

  2. Click the Edit menu in the top-left corner.

  3. Select “Refresh Node Definitions”.

    • Alternatively, you can simply press R on your keyboard.

💡 Tip: Refreshing ensures that your newly downloaded model appears in the checkpoint loader node dropdown and is ready to use in your workflow.

Run the Workflow

Once your model is loaded and visible in ComfyUI:

  1. In the checkpoint loader node, select flux1-dev-fp8.safetensors.

  2. Enter a text prompt for your generation. For example:

    ```
    Busty sci-fi woman, medium-close shot, standing in front of a massive glowing cube with neon letters "GPU", front view, sleek cyber armor with glowing purple trims, dramatic deep shadows, moody purple mist, confident smirk, slightly turned torso for dynamic pose, cinematic sci-fi lighting, subtle cleavage, expressive eyes
    ```
  3. Click Run to start the workflow.

🚀 Note: The model may take a few seconds to load the first time, as it caches into memory. Future runs will be much faster.

8. Conclusion: Ready to Go!

Congratulations! You’ve successfully set up ComfyUI with SageAttention on RunPod, complete with a persistent Network Volume to securely store your models, workflows, and extensions. With this powerful combination, you can now run high-performance AI workflows faster and more efficiently than ever, leveraging the speed and flexibility of Sage Attention!

You’ve learned how to:

  • Create and fund a RunPod account

  • Set up a Network Volume for persistent storage

  • Deploy a pod with the Next Diffusion – ComfyUI SageAttention template

  • Access ComfyUI on port 8188 and start your workflows

  • Download models and manage files through VS Code on port 8888

  • Run your first AI workflow

This setup provides a solid foundation for experimentation — from generating images to testing custom nodes and workflows. With everything stored on your Network Volume, future pods are faster and ready to use, letting you focus on creating. Whether you’re an experienced developer or just starting out, your ComfyUI environment on RunPod delivers the speed, flexibility, and convenience to bring your ideas to life. Now it’s time to unleash your creativity!
