Mr DeepFakes Forums
Ping pong on Faceswap
#1
Anyone tried this yet? Would be glad to hear your experiences and tips.
#2
Haven't tried it. It's supposed to save VRAM during training, but at the expense of training time (a 200% to 400% increase). I'm guessing the point is that the freed VRAM can be used elsewhere, such as for bigger batch sizes, larger models, etc. I also don't know if it stacks with the Memory Saving Gradients option, but it looks as if it should. I'm also not sure if it works with all available training models. I'll look into it over the next few days. Have you tried it yet?
#3
Nope, haven't had time today. Would be good if someone in the know could outline the use cases for ping pong and Memory Saving Gradients though, both totally new to me until yesterday.
#4
It means I can train Villain now. They do stack. Ping pong only loads one half at a time (i.e. it trains on side A for 100 iterations, unloads the model, loads side B, trains on that for 100 iterations, and so on), so I'd guess training time is roughly doubled, but I can't compare.
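Roughly, the pattern is something like this (just a sketch of the alternating loop as described above; the load/train/save helpers and the per-side count are placeholders, not faceswap's actual code):

Code:
# Sketch of the ping-pong pattern: only one side's model is ever
# resident in VRAM at a time. The callables are placeholders for
# whatever the trainer actually does.
def ping_pong_train(load_side, train_iteration, save_and_unload,
                    total_iterations, iterations_per_side=100):
    side = "a"
    done = 0
    while done < total_iterations:
        model = load_side(side)               # build/load side A *or* side B
        for _ in range(iterations_per_side):
            train_iteration(model, side)
        save_and_unload(model, side)          # write weights out, free the VRAM
        side = "b" if side == "a" else "a"    # swap to the other side
        done += iterations_per_side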

I don't really understand what Memory Saving Gradients does; I'll need to look into it more. From my testing it looks like it takes longer to start up (and does a load of model configuration, judging by how much it spams the logs at debug level), and then lets you have about double the batch size, or potentially train a model you couldn't train before. I don't notice much slowdown with it, to be honest, but there is definitely some.
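From a bit of reading, "memory saving gradients" generally refers to activation (gradient) checkpointing: only some intermediate activations get kept for the backward pass and the rest get recomputed on demand, which would explain both the VRAM saving and the slowdown. A toy sketch of that trade, nothing to do with faceswap's actual code:

Code:
# Toy illustration of activation checkpointing: store only every k-th
# activation on the forward pass, recompute the rest from the nearest
# checkpoint when the backward pass needs them. Less memory, more compute.
def forward_chain(x, layers, checkpoint_every=4):
    checkpoints = {0: x}                 # sparse set of stored activations
    for i, layer in enumerate(layers):
        x = layer(x)
        if (i + 1) % checkpoint_every == 0:
            checkpoints[i + 1] = x       # keep this one
    return x, checkpoints

def activation_at(step, layers, checkpoints):
    # If the activation after `step` layers wasn't stored, recompute it
    # from the closest earlier checkpoint (the extra compute we pay for).
    start = max(s for s in checkpoints if s <= step)
    x = checkpoints[start]
    for layer in layers[start:step]:
        x = layer(x)
    return x

layers = [lambda v, i=i: v * 1.01 + i for i in range(16)]
out, ckpts = forward_chain(2.0, layers)
print(len(ckpts), "activations stored instead of", len(layers) + 1)
print(activation_at(7, layers, ckpts))   # recomputed on demand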

A certain user here was whinging that they weren't supporting lower end cards on FS now. These changes look like they're almost entirely there to make things more accessible to lower end users, so that shows what he knows. I guess this also means model developers will be able to make bigger (albeit slower) models too. One user on Discord was claiming he could now train the Original model at batchsize 160 on a 4GB GTX 980 on Windows, which seems a hell of an achievement to me, seeing as the Original model often wouldn't even start on that configuration. I might have a play around to see what the minimum amount of VRAM I can start training a model with is now. In theory, if these changes scale, it may be possible to train models on less than 1GB of VRAM now.

Also, it works with all existing (and in theory future) models, which probably could only have been implemented after the refactor, so I guess it's beginning to make sense why they did it now.

I just tested limiting my VRAM. It looks like Lightweight will not go lower than 1096MB of VRAM usage at batchsize 8, but that still seems pretty good to me.
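For anyone wanting to repeat the test: I'm not sure what faceswap itself exposes for this, but the plain TF 1.x / Keras way to cap the usable VRAM is along these lines (treat it as a sketch, not a faceswap setting):

Code:
# Cap how much VRAM TensorFlow 1.x is allowed to grab, for a
# "how low can it go" test like the one above.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.25   # ~1GB of a 4GB card
K.set_session(tf.Session(config=config))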
#5
I'm training a DFaker model with Ping pong, with data from a set I used with Original and Dfl-H128 before. I'm able to get a batch size of 16 in a model I couldn't run at all on a 4gb 1050ti before. Yes, it's slow - after five hours the loss is still in the mid 0.03s - but that's not so bad for a model that was blocked to me before.

I've had no luck at all using MSG - it crashes every time I've used it, whether attempting to use it by itself or in combination with Ping pong. Perhaps I'll submit the crash log to Discord next time and see if anyone can discern why.

Would love to hear from other 4gb VRAM users about their experiences with Ping pong.
#6
For Memory Saving Gradients you need to have toposort installed.

From inside your faceswap environment:
pip install toposort
#7
Installing and updating was easy when I was using the Anaconda version of Faceswap. But now I have the Windows-installer version, and its CLI doesn't give me an input prompt. How can I pass commands to do this?
#8
Can confirm unable to run Dfl-H128 on Ping Pong on a 4gb 1050ti with Lowmem switched off - OOM errors, even on the lowest settings.
#9
If you go to Start > Anaconda Prompt (it gets installed with faceswap), then do:
conda activate faceswap
pip install toposort

You can then close the Anaconda Prompt once it's done and carry on.
#10
(03-18-2019, 03:34 PM) HarryPalmer Wrote: Can confirm unable to run Dfl-H128 on Ping Pong on a 4gb 1050ti with Lowmem switched off - OOM errors, even on the lowest settings.

So now I'm able to run a batch size 32 training session on Dfl-H128, using Ping pong with the Lowmem setting. That maxes out the 4gb of VRAM on the 1050ti. Thing is, I wonder if there is any advantage in this, since the training-time benefit of the larger batch size is offset (or even exceeded?) by the overhead of Ping pong. I'm not sure that Ping pong can do anything for Dfl-H128 on a 4gb card. Trouble is, it would be a long process to test it out and see if there are any savings.
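Rather than a full multi-day run, one way to get a rough answer would be to time a few hundred iterations in each configuration (counting Ping pong's reload pauses) and compare faces trained per second; the numbers below are placeholders, it's only the arithmetic that matters:

Code:
# Compare effective throughput of the two setups. Time, say, 500
# iterations each (including Ping pong's model reload pauses) and
# plug in the real figures; these are made up.
def faces_per_second(batch_size, seconds_per_iteration):
    return batch_size / seconds_per_iteration

lowmem_only    = faces_per_second(batch_size=8,  seconds_per_iteration=1.0)
with_ping_pong = faces_per_second(batch_size=32, seconds_per_iteration=4.5)
print("lowmem only :", lowmem_only, "faces/sec")
print("ping pong   :", with_ping_pong, "faces/sec")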

(03-18-2019, 03:49 PM) anotherfaker Wrote: If you go to Start > Anaconda Prompt (it gets installed with faceswap), then do:
conda activate faceswap
pip install toposort

You can then close the Anaconda Prompt once it's done and carry on.

Thanks. I followed that advice exactly, apparently with success, but on attempting to use it I got a crash.


Code:
ModuleNotFoundError: No module named 'toposort'
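
Next time, before resubmitting, I'll check from the activated prompt that toposort actually landed in the environment faceswap runs under, with something like:

Code:
# Run with the python you get after "conda activate faceswap" to confirm
# toposort went into the environment faceswap actually uses.
import sys
print(sys.executable)      # should point inside the faceswap env
import toposort            # raises ModuleNotFoundError if it isn't there
print(toposort.__file__)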
