Massive RAM consumption since recent commits #5094
14 comments · 20 replies
-
How much specifically? Mine was and still is eating ~11 GB of RAM, and up to 30 GB when generating large depth maps on the CPU.
-
@aliencaocao Because this might be a potential answer, I pulled it out of the nested replies for clearer reference. My "cache in RAM" setting is 0, because the models are too large for my 16 GB of RAM to handle (and I rarely use more than one model in a session).
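For reference, that setting can also be checked or changed without opening the UI, through the options endpoint of the built-in API (the webui must be started with --api). This is only a minimal sketch, assuming a default local install and that the "Checkpoints to cache in RAM" setting is exposed under the key sd_checkpoint_cache:

import requests

base_url = 'http://127.0.0.1:7860'  # assumes a local webui started with --api

# Read the current options and print how many checkpoints are cached in RAM.
opts = requests.get(f'{base_url}/sdapi/v1/options').json()
print('Checkpoints to cache in RAM:', opts.get('sd_checkpoint_cache'))

# Set it to 0 so unloaded checkpoints are not kept in system RAM.
requests.post(f'{base_url}/sdapi/v1/options', json={'sd_checkpoint_cache': 0})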
-
Currently I see that RAM consumption differs from sampler to sampler, both in my own tests and in others', so this might be related to one of the sampler changes among recent commits.
-
It's 2024 and I'm experiencing this too. I have 64 GB of RAM and an RTX 4090, and this eats literally all of my RAM, then fills up all of the swap, crashing my computer after several rounds of generation. It seems the cache limit isn't being respected and the data essentially lingers, as OP said. This is generally known as a memory leak.
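If anyone wants to verify whether memory really lingers between generations rather than being freed, one low-effort check is to watch the resident set size of the webui process over time. A minimal sketch, assuming psutil is installed and that the placeholder PID below is replaced with the PID of the python process running webui; if RSS keeps climbing after each batch and never drops back, that points to a leak rather than normal caching:

import time
import psutil

WEBUI_PID = 12345  # placeholder: replace with the PID of the python process running webui

proc = psutil.Process(WEBUI_PID)
while True:
    rss_gib = proc.memory_info().rss / 1024 ** 3
    print(f'webui resident memory: {rss_gib:.2f} GiB')
    time.sleep(10)  # sample every 10 seconds while generating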
-
I've experienced this as well, seemingly all of a sudden. As it turned out, upgrading torch for good measure when getting the latest WebUI release was not a good idea.
From what I observed on my system, it seemed like some sort of garbage collection wasn't kicking in properly every now and then, and it seemed to be connected to --medvram-sdxl to some extent (I didn't observe high usage without it, but I also can't actually generate SDXL without it). Sometimes it cleaned up after generation and everything went fine, but it would often fail and take up all system RAM (32 GB in my case), then roll the dice between crashing just Python or taking the GPU driver with it (sounds unrelated, but it happened a few times).
I don't know what triggers this specifically, but I downgraded to torch 2.0.1, torchvision 0.15.2, xformers 0.0.20 and the issues were gone. Versioning warnings aside, everything seems to be working. As I also experienced the same behavior in WebUI Forge, I'm not sure this is something to fix in WebUI itself, but I also don't feel knowledgeable enough about the ecosystem to make a useful report on upstream torch if the issue lies there. So for now I'll just leave this info here for anyone else affected. Happy to be corrected if some of this is factually wrong, but I went through venv reinstalls, WebUI downgrades, etc. and only this helped me.
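If you try the same downgrade, it is worth confirming which versions the webui environment actually ends up with, since a stale venv can hide a failed reinstall. A minimal check, assuming it is run from Python inside the activated webui venv and that xformers is installed:

# Run from inside the activated webui venv.
import torch
import torchvision
import xformers

print('torch:', torch.__version__)
print('torchvision:', torchvision.__version__)
print('xformers:', xformers.__version__)
print('CUDA available:', torch.cuda.is_available())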
-
I can confirm that the latest version (1.8.0) is much slower for me and takes more RAM. It eats all of my 16 GB and makes the whole computer lag. With the previous version, I was able to use other apps during generation. First I used version 1.7.0, then I upgraded to 1.8.0 with a simple "git pull". As suggested in the console window, I upgraded PyTorch to version 2.1.2 and started experiencing the high RAM usage. So I decided to revert the repository to the previous version with git.
That revert only affects the repository, not the venv or any dependencies; PyTorch is still 2.1.2 at this point. Launching webui-user as usual then worked as expected, at the same speed as before. My guess is that the cause of this issue is not PyTorch itself, but maybe the way it's called. To reproduce this installation, I deleted everything and started from scratch.
Maybe this helps. I've started looking at the source code to see if I can spot something. Otherwise, I will continue with version 1.7.0, as it works well.
-
I had this problem after upgrading to 1.9.0, no problem with 1.8.0.
-
+1'ing. Upgrading from 1.8.0 more than quadrupled my RAM use. It will ramp itself up to 16-20 GB on a 2 GB SD 1.5 model and not release it unless the task is killed. Win 10, 3080 10 GB, 32 GB RAM.
-
So this issue is almost two years old; I doubt it'll get fixed. Just starting up A1111 takes 3.5 GB. After that it's all downhill until it crashes. Great job.
-
+1. I hope someone figures something out; this is getting really irritating. Downgrading isn't much of an option for me specifically, so I'm kind of hoping for some kind of fix...
-
Are any of you by chance using Ultimate SD Upscale? Because that's where my problems started. I switched over to the normal SD Upscale, and RAM usage is way down.
-
I tried upgrading to 1.9.3, expecting the same behavior as 1.8.0, and I can't explain why, but it seems to be better. The RAM usage is still high at times compared to 1.7.0, but a bit lower than with 1.8.0, and I can use SDXL and ControlNet now.
I also tried some performance tips, such as disabling hardware acceleration in the browser and setting maximum performance in the NVIDIA configuration (also check your configuration if using a laptop: make sure it's not running on battery, and disable any power-saving feature).
In addition, to avoid any slowdown caused by the browser, I run Automatic1111 in API mode (with the command-line parameter --api) and wrote some Python scripts that make the query directly. This saves RAM and can increase performance. Below is one of my scripts; it generates a 1920x1080 picture with SDXL, if anyone is interested. Reference: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/API

import requests
import base64
# Define the URL and the payload to send.
url = 'http://127.0.0.1:7860'
prompt = '(fantasy castle floating on a cloud), blue sky, photography, best quality'
negative_prompt = '2D, illustration, low quality, blurry'
# Parameters for the txt2img endpoint: a 960x540 base image upscaled 2x by the
# hires. fix pass, giving a 1920x1080 result.
payload = {
    "scheduler": "Automatic",
    "prompt": prompt,
    "negative_prompt": negative_prompt,
    "width": 960,
    "height": 540,
    "steps": 20,
    "denoising_strength": 0.7,
    "sampler_name": "DPM++ 2M",
    # Hires. fix second pass: upscale the base image 2x with the Latent upscaler.
    "enable_hr": True,
    "hr_scale": 2,
    "hr_resize_x": 0,
    "hr_resize_y": 0,
    "hr_second_pass_steps": 0,
    "hr_prompt": prompt,
    "hr_negative_prompt": negative_prompt,
    "hr_upscaler": "Latent",
    # Settings for the Tiled VAE extension.
    "Tiled VAE": {"args": [True, 1024, 64, True, True, True, False]},
    # Don't return the images in the API response; save them server-side instead.
    "send_images": False,
    "save_images": True
}
# Send said payload to said URL through the API.
response = requests.post(url=f'{url}/sdapi/v1/txt2img', json=payload)
r = response.json()
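One note on the script above: since send_images is False, the images are only written to the server's output folders, and the base64 import is never actually used. If send_images is set to True instead, the generated images come back as base64 strings in the images field of the JSON response (see the API wiki linked above); a minimal sketch for decoding them:

# Only applies when the payload is sent with "send_images": True.
for i, img_b64 in enumerate(r.get('images', [])):
    with open(f'output_{i}.png', 'wb') as f:
        f.write(base64.b64decode(img_b64))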
-
All right, so I know very little about AI and all this, and I had this massive RAM problem too. Today, while fidgeting around, I found a settings menu for WSL 2 (I'm using Docker), and one of the settings let me reassign the maximum amount of RAM it could use; on my machine it was 16 GB by default. Maybe it's this WSL 2 thing causing the massive RAM consumption.
-
I started playing with the SD webui a couple of months ago. At that time, after the big RAM spike at start-up (which is the model weights being loaded), RAM consumption typically stayed nice and low even when generating images, so I could easily use other applications while it ran in the background.
Recently, however, the webui has started to consume a lot of RAM after start-up, usually after one round of image generation. Afterwards it doesn't release the memory, so consumption stays high, severely impacting the performance of my PC in other applications.
What I'd like to ask for is mainly a reason. I'm not sure why this would be the case, whether I'm doing something wrong (like improper settings) or whether some recent feature is contributing to the problem. I don't understand why it would happen otherwise, because previous versions don't have this problem.
I run my webui with xformers and DeepDanbooru, and DeepDanbooru is correctly set to use the GPU, as I can see the corresponding registration info in the start-up messages.