MrDeepFakes Forums
[GUIDE] DeepFaceLab - Google Colab Tutorial
#21
It is the clock speed. I use two cards with 16 GB of VRAM each: the Quadro P5000, with a clock speed of 1607 MHz, and the Tesla T4, with a clock speed of 585 MHz.

The Quadro runs SAE at batch 21 at around 2700-2800 ms per iteration; the T4 runs the identical model at around 3500-3600 ms.
#22
I followed the tutorial and got my model training, but the Colab session keeps resetting and clearing all the files after 2-3 hours, not 12 hours as stated. Does anyone know why?
#23
(04-30-2019, 06:21 AM)squally2k Wrote: I followed the tutorial and got my model training, but the Colab session keeps resetting and clearing all the files after 2-3 hours, not 12 hours as stated. Does anyone know why?

Because you closed the browser tab where Colab was running.
#24
(04-30-2019, 07:47 AM)chervonij Wrote: Because you closed the browser tab where Colab was running.

Oh, I see. I thought it would keep running in the background. Thanks @chervonij.
#25
Code:
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Loading: 0it [00:00, ?it/s]
Loading: 0it [00:00, ?it/s]
Error: integer division or modulo by zero
Traceback (most recent call last):
 File "/content/DeepFaceLab/mainscripts/Trainer.py", line 44, in trainerThread
   device_args=device_args)
 File "/content/DeepFaceLab/models/ModelBase.py", line 204, in __init__
   self.sample_for_preview = self.generate_next_sample()
 File "/content/DeepFaceLab/models/ModelBase.py", line 394, in generate_next_sample
   return [next(generator) for generator in self.generator_list]
 File "/content/DeepFaceLab/models/ModelBase.py", line 394, in <listcomp>
   return [next(generator) for generator in self.generator_list]
 File "/content/DeepFaceLab/samplelib/SampleGeneratorFace.py", line 52, in __next__
   generator = self.generators[self.generator_counter % len(self.generators) ]
ZeroDivisionError: integer division or modulo by zero
Done.
I'm getting this problem after I click train. I've followed the guide to a T. How can I go about fixing it?
#26
(04-30-2019, 01:28 PM)Sturmux Wrote:
Code:
ZeroDivisionError: integer division or modulo by zero
I'm getting this problem after I click train. I've followed the guide to a T. How can I go about fixing it?

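You don't have aligned images. The two "Loading: 0it" lines in your log suggest that no samples were loaded, which would leave the generator list empty and make the modulo by its length fail.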
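
For anyone curious why an empty sample set shows up as a ZeroDivisionError: the last frame of the traceback picks a generator with `counter % len(self.generators)`, and a modulo by a length of zero raises exactly that error. Here is a minimal sketch of the pattern. `RoundRobin` and `naive_pick` are hypothetical names made up for illustration, not DeepFaceLab's actual code, and the explicit check is just one way such a picker could give a clearer message.

```python
class RoundRobin:
    """Hypothetical round-robin picker (illustration only, not DeepFaceLab's class)."""

    def __init__(self, generators):
        self.generators = list(generators)
        self.counter = 0

    def __next__(self):
        # Fail early with a readable message instead of a bare ZeroDivisionError.
        if not self.generators:
            raise ValueError("no generators available: were any samples loaded?")
        generator = self.generators[self.counter % len(self.generators)]
        self.counter += 1
        return next(generator)


def naive_pick(generators, counter):
    # Same indexing expression as in the traceback; raises ZeroDivisionError
    # when the list is empty because len(generators) is 0.
    return generators[counter % len(generators)]
```

With two generators the picker alternates between them; with none, `naive_pick([], 0)` raises ZeroDivisionError while `next(RoundRobin([]))` raises the more descriptive ValueError.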
#27
Can it actually train at 256 resolution?

I have not been able to train at 256.

Edit: If you have, what were your settings?
#28
@Oglethorpe

It can, but I have not tested what the optimal settings would be for Colab.
#29
With everything at default and optimizer at 3, it gives OOM errors and never runs.

So I'm guessing I need to lower ae_dims, e_ch_dims and d_ch_dims. Would the trade-off be worth it just to train at 256?
#30
@Oglethorpe

Yes, try gradually reducing these settings. Also note that the batch size should not be very small, otherwise training will take a very long time. To reduce the memory load, you can disable mask learning.