MrDeepFakes Forums


1080 TI and super low BS.

DeepCloud

DF Vagrant
==            resolution: 256                 ==
==             face_type: f                   ==
==            learn_mask: True                ==
==        optimizer_mode: 3                   ==
==                 archi: df                  ==
==               ae_dims: 512                 ==
==            ed_ch_dims: 21                  ==
==           random_warp: True                ==
==    true_face_training: False               ==
==      face_style_power: 0.0                 ==
==        bg_style_power: 0.0                 ==
==               ct_mode: rct                 ==
==              clipgrad: False               ==
==            batch_size: 4                   ==
==                                            ==
==---------------- Running On ----------------==
==                                            ==
==          Device index: 0                   ==
==                  Name: GeForce GTX 1080 Ti ==
==                  VRAM: 11.00GB             ==
==                                            ==

This is my SAEHD training at 256 resolution; everything else I left at default. I couldn't run a batch size greater than 4, despite turning random warp off. Is this normal for a 1080 Ti? I have seen some users mention that they could push to BS 8 on SAEHD 256 with their 1080 Ti.

What is wrong with my settings here?

TalosOfCrete

DF Vagrant
-You don't need learn_mask on the entire time. It is generally recommended to only have it on for a couple thousand iterations, as it learns the mask fast and consumes a considerable chunk of VRAM.

-In general, I never recommend 256x256. The sweet spot is usually 160x160 or 192x192. For example, Ctrl+Shift+Face trains at 160x160. Cranking it up to 256x256 is extremely computationally intensive, and is forcing you to use Optimizer 3 and BS 4 (256x256 is only really beneficial in high resolution, very close up shots). For your hardware (and in general) you should definitely turn it down.

-Last tips:
-Use clipgrad. It can save your model's ass quite often. A collapsed model will mean you'll have to start over from your most recent backup. Not fun. The performance penalty is quite small anyway.
-On your hardware, you may be able to add a couple to a few (~3) more e_ch/d_ch dims (encoder/decoder dimensions per channel), plus a ~32 jump in ae_dims, if you drop your resolution. This will make smaller details like eye movements and teeth easier and faster to capture. (IF YOU BUMP THEM, MAKE SURE TO INCREASE BOTH: increasing only one or the other may waste computational resources, with the model either failing to carry the information from one end to the other (ae_dims) or failing to capture the detail in the first place (e_ch/d_ch dims).) A rough sketch of the trade-off follows below.
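A back-of-the-envelope sketch of that trade-off, assuming per-iteration cost scales roughly with pixel count times dims. This is a deliberate simplification for intuition only, not DeepFaceLab's actual memory accounting, and the "proposed" numbers are just illustrative:

# Toy comparison of two SAEHD configs: per-iteration cost ~ pixels * dims.
def relative_cost(resolution, ae_dims, ch_dims):
    return resolution ** 2 * (ae_dims + ch_dims)

current  = relative_cost(256, ae_dims=512, ch_dims=21)   # settings from the OP
proposed = relative_cost(192, ae_dims=544, ch_dims=24)   # lower res, +32 ae, +3 ch

print(f"proposed / current: {proposed / current:.2f}")   # ~0.60
# Dropping 256 -> 192 more than pays for the extra dims, leaving VRAM
# headroom for a larger batch size.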

TMBDF

Moderator | Deepfake Creator | Guide maintainer
Staff member
Moderator
Verified Video Creator
TalosOfCrete said:
-You don't need learn_mask on the entire time. [...]
Nice to see people reading some of my stuff (unless you knew all of that, then good job anyway on just not being another n00b ;P ).
Exactly, learn mask only needs to be enabled for a while (I'd say it's best to turn it on once the faces are well trained). Learn mask is surprisingly heavy on VRAM and overall speed. I recommend just using FAN-DST; it gets the job done 99% of the time.

256 is fine for closeups and really only that. If you're doing SFW fakes it might be worth it, but only if you have a really good dataset (super sharp, consistent lighting); otherwise all the flaws of your dataset will show up even more, especially if the person has facial hair, which changes look constantly, and at higher res the model can have difficulty training properly on individual strands/clumps of hair, etc.
I'd also recommend 256 for stuff like 4K porn where you just need some more detail, but all of that can be faked with upscaling; built-in sharpening along with upscaling (RankSRGAN) can really make the faces look high res. Anything above 128 is fine for 60% of scenes, then 160/176/192 will help with some closeups, etc., or just make small-detail learning more effective (stuff like freckles, beauty marks, etc.).
Remember 256x256 has four times the pixel count of 128x128 (65,536 vs. 16,384 pixels), so in a perfectly "scalable" world that would mean roughly 4x the data and VRAM usage, 4x slower iterations, and a 4x smaller batch size.
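A quick sanity check on that scaling (idealized; real VRAM use also includes fixed-size model weights, so the batch-size effect is not exactly linear):

# Pixel-count ratios of common SAEHD resolutions relative to 128x128.
base = 128 ** 2
for res in (128, 160, 192, 256):
    print(f"{res}x{res}: {res ** 2 / base:.2f}x")
# 128x128: 1.00x  160x160: 1.56x  192x192: 2.25x  256x256: 4.00x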

Still don't know why clipgrad isn't just on by default (and not even toggleable); the performance hit is within the margin of error. I measured a ~50 ms increase in iteration time. I didn't check VRAM, but it didn't cause an OOM at a maxed-out batch size, so it's probably not much.
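For intuition, gradient clipping just caps the size of each update step so one bad batch can't blow up the weights and collapse the model. A minimal NumPy sketch of the clip-by-value idea (the 0.5 threshold is illustrative, not DFL's actual constant):

import numpy as np

def clip_gradients(grads, clip_value=0.5):
    # Element-wise clip-by-value: outlier gradients get capped,
    # normal ones pass through unchanged.
    return [np.clip(g, -clip_value, clip_value) for g in grads]

grads = [np.array([0.01, -3.7, 0.2])]
print(clip_gradients(grads))  # [array([ 0.01, -0.5 ,  0.2 ])]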

I'd probably focus more on resolution, though, than on increasing ae/e/d_ch dims, especially since at lower resolution the change might not be noticeable, while higher resolution definitely would be.

DeepCloud

DF Vagrant
tutsmybarreh said:
Still don't know why clipgrad isn't just on by default [...]

Thanks, I will try clipgrad now.