MrDeepFakes Forums


model training change

thanks
I thought maybe I could shorten the training time if the H128 model could transfer over to SAE,
but it seems I have to restart from scratch......

There is another problem with SAE:
my computer can run H128, but it can't run SAE.
 

mondomonger

DF Admirer
Verified Video Creator
SAE is slow to train and conversions are 4x slower (with FAN-dst). The upside is SAE does a better job with profiles and can handle obstructions. Model stability was an issue in the past, not sure about now.

H128 is stable. Can't handle obstructions or profiles very well. Much, much faster. I have never had an H128 model collapse and I have multiple models at 600k+ iterations and sometimes recycle to 2-3-4 different pornstars or celebs.

I have stopped H128 and worked on SAE twice now - wasting 2 weeks in my opinion. I keep coming back to H128.

I love clips that are SAE. They look great, but to me the time/effort vs the improvement in output is just not worth it.

(Feels nice to write this and not worry about iperov coming on here to blast me out of the galaxy.......)
 
"OOM"
Does that mean my GPU is too weak to run SAE??
I use a GeForce GTX 1060 6GB.
From the posts it seems that SAE does better than H128, so I tried to use it.


===== Model summary =====
== Model name: SAE
==
== Current iteration: 0
==
== Model options:
== |== batch_size : 8
== |== sort_by_yaw : False
== |== random_flip : True
== |== resolution : 128
== |== face_type : f
== |== learn_mask : True
== |== optimizer_mode : 1
== |== archi : df
== |== ae_dims : 512
== |== e_ch_dims : 42
== |== d_ch_dims : 21
== |== multiscale_decoder : False
== |== ca_weights : False
== |== pixel_loss : False
== |== face_style_power : 10.0
== |== bg_style_power : 10.0
== |== apply_random_ct : False
== |== clipgrad : False
== Running on:
== |== [0 : GeForce GTX 1060 6GB]
=========================
Starting. Press "Enter" to stop training and save model.
Error: OOM when allocating tensor with shape[8,252,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model_3/conv2d_31/convolution}} = Conv2D[T=DT_FLOAT, _class=["loc:@gradients/model_3/conv2d_31/convolution_grad/Conv2DBackpropInput"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model_3/leaky_re_lu_29/LeakyRelu, conv2d_31/kernel/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node add_29/_1117}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7871_add_29", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\DeepFaceLab\mainscripts\Trainer.py", line 107, in trainerThread
iter, iter_time = model.train_one_iter()
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\DeepFaceLab\models\ModelBase.py", line 472, in train_one_iter
losses = self.onTrainOneIter(sample, self.generator_list)
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\DeepFaceLab\models\Model_SAE\Model.py", line 430, in onTrainOneIter
src_loss, dst_loss, = self.src_dst_train (feed)
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
run_metadata_ptr)
File "C:\Users\user\Desktop\DeepFaceLabCUDA10.1AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[8,252,64,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node model_3/conv2d_31/convolution}} = Conv2D[T=DT_FLOAT, _class=["loc:@gradients/model_3/conv2d_31/convolution_grad/Conv2DBackpropInput"], data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model_3/leaky_re_lu_29/LeakyRelu, conv2d_31/kernel/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node add_29/_1117}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_7871_add_29", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Done.
 

zipperguy

DF Vagrant
Verified Video Creator
Try lowering the batch size. I have a GTX 1080 and couldn't use a batch size greater than 4 without running into memory problems.
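
For a 6GB card that would mean restarting SAE and answering the prompts so the summary comes out something like this; batch size 4 is only an illustrative starting point and you may need to go lower if it still OOMs (the other options are the ones from your summary above):

== Model options:
== |== batch_size : 4
== |== resolution : 128
== |== face_type : f
== |== optimizer_mode : 1
== |== archi : df
== |== ae_dims : 512
== |== e_ch_dims : 42
== |== d_ch_dims : 21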
 

[deleted]

DF Vagrant
Kyuri said:
Change optimizer_mode to 2 or 3

Use H128 bro, it's quicker and can be just as good as SAE, and you can use a higher batch. There's especially no need for the extra overhead of SAE if you are just doing 128.
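
If you do stay on SAE, optimizer_mode is the line to change in that summary. As I understand it, modes 2 and 3 offload part of the optimizer state to system RAM, so iterations get slower but a 6GB card can hold a bigger batch. Combined with a smaller batch it would look roughly like:

== |== batch_size : 4
== |== optimizer_mode : 2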
 
GhostTears said:
Kyuri said:
Change optimizer_mode to 2 or 3

Using H128 is faster, it can be as good as SAE, and you can use a higher batch. If you are just doing 128 there's especially no need for the extra SAE overhead

H128 is good,
but H128 can't do blowjobs.........


When I train SAE,
the system shows "out of memory",
but it still runs.
Should I worry about this message?
 

[deleted]

DF Vagrant
liang741107 said:
GhostTears said:
Kyuri said:
Change optimizer_mode to 2 or 3

Using H128 is faster, it can be as good as SAE, and you can use a higher batch. If you are just doing 128 there's especially no need for the extra SAE overhead

H128 is good,
but H128 can't do blowjobs.........


When I train SAE,
the system shows "out of memory",
but it still runs.
Should I worry about this message?



It might help to know what batch size you are running and what kind of GPU you have. I think @dpfks has a spreadsheet somewhere with the proper batch for SAE on lots of cards.
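
If you want to feel it out yourself before restarting SAE over and over, a throwaway probe like the sketch below can help. It is not DeepFaceLab code: it just builds a made-up Keras conv model at 128x128 (far smaller than SAE, so treat the result as an optimistic upper bound) and tries one training step at a few batch sizes until one fits without OOM.

import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Conv2D

def fits(batch_size):
    # Fresh graph each attempt, then one training step on random data;
    # an OOM shows up as ResourceExhaustedError just like in the trainer log.
    K.clear_session()
    model = Sequential([
        Conv2D(64, 3, padding='same', activation='relu', input_shape=(128, 128, 3)),
        Conv2D(128, 3, padding='same', activation='relu'),
        Conv2D(3, 3, padding='same'),
    ])
    model.compile(optimizer='adam', loss='mse')
    x = np.random.rand(batch_size, 128, 128, 3).astype('float32')
    try:
        model.train_on_batch(x, x)
        return True
    except tf.errors.ResourceExhaustedError:
        return False

for bs in (16, 8, 4, 2):
    if fits(bs):
        print('largest batch that trained without OOM:', bs)
        break
else:
    print('even batch size 2 ran out of memory')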
 