inference time exploding in real-time-gui #155
Comments
Hi, thanks for your feedback on this issue.
Is there anything that comes to mind that I could look into to help resolve this issue? If so, even a small hint would be a huge help.
I have the same or similar issue. The program runs, but the CPU/GPU load keeps gradually increasing until it hits 100%; then the inference times skyrocket and the audio becomes choppy. I see it on two machines and it happens on both. One is a 9800X3D with a 4090, the other an Epyc 7C13 with an Ada 4500. The only difference is that on the Epyc, being the weaker machine, the audio breaks up within a matter of seconds, while the 9800X3D can run for about 20 seconds before it breaks; the symptoms are otherwise the same. The two machines also differ in software: one runs Windows 10 21H2 LTSC, the other 22H2. I have tried different CUDA toolkit versions (11.8 and 12.6) with the corresponding torch versions, but it happens all the same on both. An interesting aside: when I let pip install all the torch components according to requirements.txt, it never works. It only works if I install them manually from https://pytorch.org/get-started/locally/ according to the CUDA version on the system. I also tried different GPU drivers, but nothing seems to make a difference. This affects the real-time conversion only; the webui and even training work fine. Hitting the Stop Voice Conversion button and then starting again clears the issue, and it works for another 20 seconds or so. There must be some process that, instead of repeating, keeps running again and again without the previous one being terminated, until it chokes the system. But I am not a coder, so I have no way of diagnosing this myself. Any help would be appreciated, because this software works great for me audio-quality-wise, even better than RVC, especially after training my own checkpoints, but this issue keeps me from using it.
Sorry for the double post, but there is progress. I randomly tried an older version, this one: https://github.com/Plachtaa/seed-vc/tree/258fde908e9569e0882238da5e1d76d16681a7e0 and it works fine. The CPU load does not keep climbing; I had it running for 5 minutes and it still ran and sounded fine, so it is something that appeared in the newer versions. I will keep trying different versions until I find the one where the problem appeared.
I found it. This is the last working version, from January 17th; all those that come after suffer from this bug: https://github.com/Plachtaa/seed-vc/tree/670679ebf66c952cf8d13e9f728abbdd65b5014d The problem seems to be somewhere in the real-time-gui.py script. I took all the files from the current version, replaced just this one file with the January 17th version, and it works fine.
I was running into the same issue. Thank you @DocZenith for finding this! I further isolated the issue to a single line change in this commit: 99b572b#diff-48390b9ad1b2b7742a5c58f5c2a123708c9737ba101653103c81508634d72d0fL898 It is extremely easy to reproduce by setting the block time to anything above 0.5, which triggers the min condition consistently and results in a steady increase in inference time until it exceeds the block time and breaks. I have no idea what the commit originally intended to fix, but reverting that one line definitely fixed it for me. revert to
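To make that failure mode concrete, here is a minimal, purely hypothetical sketch; it is not the actual seed-vc code, and the names (simulate, clamp, context) and the 0.5 s threshold are my own assumptions. It only illustrates the general pattern: if the amount of audio consumed per block is clamped with min() against a fixed threshold, any block time above that threshold leaves a growing remainder behind, so the model input, and with it the inference time, increases every block.

```python
# Hypothetical illustration only -- NOT the actual seed-vc code. It shows how a
# min() clamp on the audio consumed per block can make the rolling model input
# grow without bound once block_time exceeds the clamp threshold.

SAMPLE_RATE = 16000

def simulate(block_time: float, clamp: float, n_blocks: int = 6) -> None:
    context = 0                                # samples carried over between blocks
    block = int(block_time * SAMPLE_RATE)      # new samples arriving each block
    for i in range(n_blocks):
        model_input = context + block          # audio fed to the model this block
        # Buggy pattern: only min(block_time, clamp) worth of audio is consumed,
        # so whenever block_time > clamp the leftover accumulates every block.
        consumed = int(min(block_time, clamp) * SAMPLE_RATE)
        context = model_input - consumed
        print(f"block {i}: model input = {model_input / SAMPLE_RATE:.2f}s")

simulate(block_time=0.7, clamp=0.5)   # input grows: 0.70s, 0.90s, 1.10s, ...
simulate(block_time=0.4, clamp=0.5)   # input stays at 0.40s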
@DocZenith @mraffeiner thank you so much for finding and sharing this! I'll try it.
Hello. I tried the real-time-gui, and I noticed that when I start real-time voice conversion and leave it running, without pressing Stop Voice Conversion and then Start Voice Conversion again, the inference time strangely keeps growing endlessly.
The inference time keeps increasing until it exceeds the block time, and the outputs get buffered. If I increase the block time, the issue is temporarily resolved, but latency gets worse, and even with a larger block time the inference time eventually exceeds it again.
Do you happen to know about this issue? If so, I would greatly appreciate it if you could let me know how to resolve it.
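As a rough back-of-the-envelope illustration of why the output gets buffered (hypothetical numbers, not project code): in any block-based real-time pipeline, each block_time seconds of audio must be processed in under block_time seconds, otherwise a backlog accumulates and the audio eventually breaks up. Raising block_time only postpones the break if the inference time itself keeps climbing.

```python
# Hypothetical back-of-the-envelope model (not project code): how far behind
# real time the output falls when per-block inference takes longer than the block.

def backlog_after(n_blocks: int, block_time: float, inference_time: float) -> float:
    """Seconds of audio waiting unprocessed after n_blocks blocks."""
    per_block_deficit = max(0.0, inference_time - block_time)
    return n_blocks * per_block_deficit

# 0.5 s blocks processed in 0.6 s: the output drifts 0.1 s further behind per block.
print(backlog_after(n_blocks=40, block_time=0.5, inference_time=0.6))   # 4.0 seconds behind
# 0.5 s blocks processed in 0.4 s: the pipeline keeps up indefinitely.
print(backlog_after(n_blocks=40, block_time=0.5, inference_time=0.4))   # 0.0
```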