Memory check before inference to avoid VAE Decode using exceeded VRAM. #5745

wl2018 · 2024-11-23T14:20:35Z

Before decoding, check that free memory is at least the predicted requirement. If it is not, try to free the required amount of memory; if that still is not enough, switch directly to tiled VAE decoding.

It seems that after an OOM occurs, PyTorch may keep holding the memory until the model is destroyed. This commit tries to prevent the OOM from happening in the first place for VAE Decode.

This is for the "VAE Decode ran with exceeded VRAM" problem from #5737.
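For readers following along, the check described above can be sketched roughly as below. This is a minimal illustration, not the code in comfy/sd.py: `choose_decode_path`, `FakeDevice`, and the two callables are hypothetical names standing in for whatever reports and frees device memory.

```python
from typing import Callable


def choose_decode_path(
    memory_needed: int,
    get_free_memory: Callable[[], int],
    free_memory: Callable[[int], None],
) -> str:
    """Return "regular" or "tiled" based on a pre-inference memory check.

    All three parameters are hypothetical stand-ins for this sketch.
    """
    # First check: is there already enough free memory?
    if get_free_memory() >= memory_needed:
        return "regular"
    # Not enough: ask the memory manager to release what it can.
    free_memory(memory_needed)
    # Second check: did freeing help?
    if get_free_memory() >= memory_needed:
        return "regular"
    # Still short: skip the regular decode and go straight to tiled.
    return "tiled"


class FakeDevice:
    """Toy device used only to exercise the sketch (sizes in arbitrary units)."""

    def __init__(self, free: int, reclaimable: int) -> None:
        self.free = free
        self.reclaimable = reclaimable

    def get_free_memory(self) -> int:
        return self.free

    def free_memory(self, needed: int) -> None:
        # Release everything reclaimable, regardless of how much was requested.
        self.free += self.reclaimable
        self.reclaimable = 0


if __name__ == "__main__":
    for free, reclaimable in [(4, 0), (1, 3), (1, 0)]:
        dev = FakeDevice(free, reclaimable)
        print(free, reclaimable, choose_decode_path(2, dev.get_free_memory, dev.free_memory))
```

The decision depends entirely on how accurate the predicted requirement is, which is the point raised in the review below.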

comfyanonymous · 2024-11-24T02:34:05Z

comfy/sd.py

+            logging.debug(f"Free memory: {free_memory} bytes, predicted memory useage of one batch: {memory_used} bytes")
+            if free_memory < memory_used:
+                logging.debug("Possible out of memory is detected, try to free memory.")
+                model_management.free_memory(memory_used, self.device, [self.patcher])


The load_models_gpu function already calls free_memory.

Got it. So there's no need for two checks and an additional free_memory call.

comfyanonymous · 2024-11-24T02:36:59Z

comfy/sd.py

+                logging.debug(f"Free memory: {free_memory} bytes")
+                if free_memory < memory_used:
+                    logging.warning("Warning: Out of memory is predicted for regular VAE decoding, directly switch to tiled VAE decoding.")
+                    predicted_oom = True


The reason for actually trying the regular decode is that the memory estimate might not be accurate and may overestimate the amount of memory needed, so it is better to attempt the decoding.

The proper way to solve the issue would be to free the memory properly on OOM before doing tiled decode.
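A minimal sketch of that order of operations (try the regular decode, release memory on OOM, then fall back to tiled decode) could look like the following. Everything here is hypothetical: `OutOfMemory` stands in for the framework's OOM exception (in PyTorch that would be something like `torch.cuda.OutOfMemoryError`), and `release_memory` stands in for cleanup such as `gc.collect()` followed by `torch.cuda.empty_cache()`. Whether such cleanup actually returns all the VRAM after an OOM is exactly the open question in this thread.

```python
from typing import Any, Callable


class OutOfMemory(RuntimeError):
    """Hypothetical stand-in for the framework's out-of-memory exception."""


def decode_with_fallback(
    latent: Any,
    decode_regular: Callable[[Any], Any],
    decode_tiled: Callable[[Any], Any],
    release_memory: Callable[[], None],
) -> Any:
    """Try the regular decode first; on OOM, release memory and decode tiled."""
    try:
        return decode_regular(latent)
    except OutOfMemory:
        # Do the recovery outside the except block: while the handler runs,
        # the exception's traceback still references the frames (and the
        # tensors) of the failed attempt, so they cannot be freed yet.
        pass
    release_memory()
    return decode_tiled(latent)


if __name__ == "__main__":
    def regular_oom(latent: Any) -> Any:
        raise OutOfMemory("simulated OOM")

    def tiled(latent: Any) -> Any:
        return ("tiled", latent)

    print(decode_with_fallback("z", regular_oom, tiled, lambda: None))
```

With this shape the memory estimate is not needed at all, so an overestimate can never force the lower-quality tiled path unnecessarily.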

OK, so first we need to know how to properly free the VRAM after an OOM...
I don't think it's suitable to just destroy the entire model object and then reload it.

But actually, I found that sometimes the OOM didn't occur at all: decoding just kept running, consumed a lot of shared GPU memory, and became very slow. This happened randomly.

Another point: what's the drawback of using tiled decode? It doesn't look slow, at least on my computer. So why do we care so much about overestimating?

Tiled decode gives lower quality images/videos.

Check if free memory is not less than expected before doing actual decoding, and if it fails, switch to tiled VAE decoding directly. It seems PyTorch may continue occupying memory until the model is destroyed after OOM occurs. This commit tries to avoid OOM from happening in the first place for VAE Decode. This is for VAE Decode ran with exceeded VRAM from comfyanonymous#5737.

wl2018 requested a review from comfyanonymous as a code owner November 23, 2024 14:20

wl2018 force-pushed the pr-20241123_VAE-Decode-improvements branch 2 times, most recently from e85d80f to 58eb317 on November 23, 2024 15:17

wl2018 changed the title ~~Memory check before inference to avoid VAE Decode using exceeded RAM.~~ Memory check before inference to avoid VAE Decode using exceeded VRAM. Nov 23, 2024

comfyanonymous reviewed Nov 24, 2024

wl2018 force-pushed the pr-20241123_VAE-Decode-improvements branch from 58eb317 to a3b9b3c on November 24, 2024 10:57
