MrDeepFakes Forums


DFL CUDA error on GTX 1050 Ti?

tarqua

DF Vagrant
I've searched everywhere but haven't found any clarity on this error so far... I'm trying to run DFL on a new PC: 
  • Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz
  • Windows 10 64-bit
  • 16 GB DDR3 RAM
  • NVIDIA GeForce GTX 1050 Ti (4 GB dedicated), freshly installed drivers
I know the 1050 Ti isn't ideal, but it's what I have to work with for now, and it's still better than running on my CPU. I can run DeepFaceLabOpenCLSSE_build_04_07_2019, but if I try to train using DeepFaceLabCUDA9.2SSE_build_04_07_2019, I get the following error every time, no matter what settings/batch size I try. Any assistance is much appreciated. I'd also welcome any advice on the best settings for my system, since I'm new to DFL. I'm starting with a faceset of 3,534 images in src and 2,654 in dst for training.

Running trainer.

Loading model...

Model first run. Enter model options as default for each run.
Write preview history? (y/n ?:help skip:n) : n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 100
Feed faces to network sorted by yaw? (y/n ?:help skip:n) : n
Flip faces randomly? (y/n ?:help skip:y) :
y
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y) :
y
Optimizer mode? ( 1,2,3 ?:help skip:1) :
1
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:512) :
512
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Remove gray border? (y/n, ?:help skip:n) :
n
Use multiscale decoder? (y/n, ?:help skip:n) :
n
Use pixel loss? (y/n, ?:help skip: n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Using TensorFlow backend.
Loading: 100%|####################################################################| 3534/3534 [00:17<00:00, 202.30it/s]
Loading: 100%|####################################################################| 2654/2654 [00:05<00:00, 483.74it/s]
===== Model summary =====
== Model name: SAE
==
== Current iteration: 0
==
== Model options:
== |== batch_size : 100
== |== sort_by_yaw : False
== |== random_flip : True
== |== resolution : 128
== |== face_type : f
== |== learn_mask : True
== |== optimizer_mode : 1
== |== archi : df
== |== ae_dims : 512
== |== e_ch_dims : 42
== |== d_ch_dims : 21
== |== remove_gray_border : False
== |== multiscale_decoder : False
== |== pixel_loss : False
== |== face_style_power : 0.0
== |== bg_style_power : 0.0
== Running on:
== |== [0 : GeForce GTX 1050 Ti]
=========================
Starting. Press "Enter" to stop training and save model.
Error: OOM when allocating tensor with shape[504] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node mul_137}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_101/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[{{node Mean_2/_1091}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6240_Mean_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\DeepFaceLab\mainscripts\Trainer.py", line 93, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\DeepFaceLab\models\ModelBase.py", line 362, in train_one_iter
    losses = self.onTrainOneIter(sample, self.generator_list)
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\DeepFaceLab\models\Model_SAE\Model.py", line 375, in onTrainOneIter
    src_loss, dst_loss, = self.src_dst_train (feed)
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
    run_metadata_ptr)
  File "C:\DeepFaceLabCUDA9.2SSE\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[504] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node mul_137}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_101/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[{{node Mean_2/_1091}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_6240_Mean_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Done.
Press any key to continue . . .
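Edit: for anyone who wants to follow the hint in that traceback, as far as I can tell DFL doesn't expose a switch for it, so you'd have to patch the session call yourself. A standalone TF 1.x sketch of what the hint means (toy graph, not DFL's code):

import tensorflow as tf

# Toy graph standing in for the model; the point here is only the RunOptions flag.
x = tf.Variable(tf.random_normal([1024, 1024]))
y = tf.matmul(x, x)

# This is the flag the hint refers to: when an OOM occurs, TF lists the
# tensors it had allocated instead of reporting just the failing node.
opts = tf.RunOptions(report_tensor_allocations_upon_oom=True)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(y, options=opts)  # the options must be passed to each run() call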
 

tarqua

DF Vagrant
Thanks. I've even tried a batch size of 4 and got nowhere with SAE. After more reading and understanding, I tried H64 with a batch size of 8, and it at least started training, but still didn't last long. I figured 4 GB would handle at least half the settings I see for 8 GB cards, but I guess not :/
 

Pocketspeed

DF Admirer
Verified Video Creator
You should be able to use a lightweight or low-memory option (I forget what it's called in DFL; I think it's the lightweight encoder). I know there have been successful deepfakes made with DFL on 2 GB cards, though maybe not with the SAE model, I don't know. Check the Guides section for more info about DFL. The batch size will probably need to be low, though.
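Also, VRAM doesn't scale the way you'd hope: the model weights and optimizer state are a fixed cost paid before the first sample, and only the activations grow with batch size. Rough toy numbers in Python (made up for illustration, not measured from DFL):

# Toy VRAM model with made-up figures, just to show the shape of the problem;
# none of these numbers are measured from DFL.
fixed_gb = 2.5        # weights + optimizer state + workspace (hypothetical)
per_sample_gb = 0.5   # activation memory per sample (hypothetical)

def max_batch(vram_gb):
    # Largest batch size that fits under this toy linear model.
    return max(0, int((vram_gb - fixed_gb) / per_sample_gb))

print(max_batch(8.0))  # -> 11 on an 8 GB card
print(max_batch(4.0))  # -> 3 on a 4 GB card: far less than half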
 

coffee

DF Vagrant
Verified Video Creator
I have a GTX 980 and top out at about batch size 10 for SAE, but I need to use optimizer mode 2. I also sometimes need to restart my PC for it to work; maybe the cache builds up too much. The only other setting I have different is multiscale decoder: true.
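If it helps to see why mode 2 buys headroom: Adam keeps two running-moment tensors per trainable weight, so the optimizer state alone is twice the weight memory. Quick back-of-the-envelope (the parameter count is a guess, not DFL's real number):

# Adam stores two extra float32 tensors (the m and v moments) for every
# trainable weight, so its state costs 2x the weight memory.
params = 60_000_000               # hypothetical trainable parameter count
weights_mb = params * 4 / 2**20   # float32 weights
adam_mb = 2 * weights_mb          # m and v moment tensors
print(f"weights ~{weights_mb:.0f} MB, Adam state ~{adam_mb:.0f} MB")
# -> weights ~229 MB, Adam state ~458 MB. Optimizer modes 2/3 keep that
# state in system RAM instead of VRAM, trading speed for a bigger batch.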
 

dpfks

DF Enthusiast
Staff member
Administrator
Verified Video Creator
tarqua said:
Thanks. I've even tried a batch size of 4 and got nowhere with SAE. After more reading and understanding, I tried H64 with a batch size of 8, and it at least started training, but still didn't last long. I figured 4 GB would handle at least half the settings I see for 8 GB cards, but I guess not :/

Use BS 4 and optimizer mode 2.
 