MrDeepFakes Forums


Can anyone help me get CUDA working on a GTX 1650?

xxxxxx

DF Pleb
I bought a new laptop with a GeForce GTX 1650. I tried DeepFaceLab_CUDA_9.2_SSE, but it failed with a CUDA error. Then I tried DeepFaceLab_CUDA_10.1_AVX and got this error:


Performing 1st pass...
Running on GeForce GTX 1650 #0.
Exception: Traceback (most recent call last):
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\joblib\SubprocessorBase.py", line 59, in _subprocess_run
    self.on_initialize(client_dict)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\mainscripts\Extractor.py", line 72, in on_initialize
    nnlib.import_all (device_config)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\nnlib\nnlib.py", line 1312, in import_all
    nnlib.import_keras(device_config)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\nnlib\nnlib.py", line 206, in import_keras
    nnlib._import_tf(device_config)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\nnlib\nnlib.py", line 194, in _import_tf
    nnlib.tf_sess = tf.Session(config=config)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1551, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 676, in __init__
    self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

Then I installed CUDA 10.1 from the NVIDIA site, although the GeForce GTX 1650 isn't on this list (https://developer.nvidia.com/cuda-gpus).

That didn't work either. What am I doing wrong?
Can anyone help, please?
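The "CUDA driver version is insufficient for CUDA runtime version" error means the installed NVIDIA driver is older than the minimum the bundled CUDA 10.1 runtime requires (generally a 418-series driver or newer on Windows; treat the exact number below as an assumption and check NVIDIA's release notes). The comparison itself is just a version check against what `nvidia-smi` reports, sketched here with hypothetical values:

```python
def parse_version(s):
    """Turn a driver version string like '418.96' into a comparable tuple."""
    return tuple(int(part) for part in s.split("."))

def driver_is_sufficient(installed, required):
    """True if the installed driver is at least the required minimum."""
    return parse_version(installed) >= parse_version(required)

# Assumed minimum: CUDA 10.1 builds generally want a 418-series driver.
REQUIRED_FOR_CUDA_10_1 = "418.96"

print(driver_is_sufficient("430.86", REQUIRED_FOR_CUDA_10_1))  # a newer driver passes
print(driver_is_sufficient("398.36", REQUIRED_FOR_CUDA_10_1))  # an older driver fails
```

In practice you never install the CUDA toolkit for DFL builds; you only update the GPU driver until it meets the runtime's minimum.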
 

Groggy4

NotSure
Verified Video Creator
Have you downloaded NVIDIA's GeForce Experience and installed the latest available driver build? I'm not sure it's pre-installed on every laptop; I had to get that software manually.
 

xxxxxx

DF Pleb

Hi, thanks for replying.

Yes, I see it now when I right-click the NVIDIA icon. I need to agree to the terms for it to proceed.
Do I need this program? And should I use DeepFaceLab_CUDA_10.1_AVX?
 

Groggy4

NotSure
Verified Video Creator
Yeah, you need it to get the latest drivers for your graphics card, among other things. Try the AVX build first; it should work with your processor, I'd guess.
 

TMBDF

Moderator | Deepfake Creator | Guide maintainer
Staff member
Moderator
Verified Video Creator
You don't need to install CUDA; just make sure you have the newest GPU drivers and it should all work. Also, you didn't specify which DFL build (date) you are running, your OS, or where the error occurs. Without that information we can't help you.
 

xxxxxx

DF Pleb
I have to stop and pick this up again over the weekend. You're helping me a lot, thanks. I hope you'll be around next week.
 

xxxxxx

DF Pleb
My new laptop:

Acer
Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz
RAM: 32 GB (31.8 GB usable)
64-bit OS, x64 processor
Windows 10
512 GB SSD + 1 TB HDD
NVIDIA GeForce GTX 1650

I downloaded both DeepFaceLab_CUDA_9.2_SSE_build_11_12_2019 and DeepFaceLab_CUDA_10.1_AVX_build_11_14_2019 to my D: drive.


First I uninstalled CUDA 10.1.

Then, as you suggested, I got the latest GPU drivers: I opened GeForce Experience, created an NVIDIA account, and installed and then upgraded to the latest GeForce Game Ready Driver.
Is this OK?
Then I tried again to run DeepFaceLab_CUDA_10.1_AVX_build_11_14_2019. At step 6 I got this error:


Running trainer.

Loading model...

Model first run.
Enable autobackup? (y/n ?:help skip:n) : y
Write preview history? (y/n ?:help skip:n) : y
Choose image for the preview history? (y/n skip:n) : y
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : ?
Larger batch size is better for NN's generalization, but it can cause Out of Memory error. Tune this value for your videocard manually.
Batch_size (?:help skip:0) : 4
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:y) :
y
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y ) :
y
Optimizer mode? ( 1,2,3 ?:help skip:1) :
1
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:512) :
512
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Use CA weights? (y/n, ?:help skip:n ) :
n
Use pixel loss? (y/n, ?:help skip:n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) : 1
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) : 1
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt/sot, ?:help skip:none) : rct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) : y
Using TensorFlow backend.
Loading: 100%|##################################################################| 24711/24711 [01:07<00:00, 366.94it/s]
Choose image for the preview history. [p] - next. [enter] - confirm.
=============== Model Summary ===============
== ==
== Model name: SAE ==
== ==
== Current iteration: 0 ==
== ==
==------------- Model Options -------------==
== ==
== autobackup: True ==
== write_preview_history: True ==
== sort_by_yaw: False ==
== random_flip: True ==
== resolution: 128 ==
== face_type: f ==
== learn_mask: True ==
== optimizer_mode: 1 ==
== archi: df ==
== ae_dims: 512 ==
== e_ch_dims: 42 ==
== d_ch_dims: 21 ==
== ca_weights: False ==
== pixel_loss: False ==
== face_style_power: 1.0 ==
== bg_style_power: 1.0 ==
== ct_mode: rct ==
== clipgrad: False ==
== pretrain: True ==
== batch_size: 4 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce GTX 1650 ==
== VRAM: 4.00GB ==
== ==
=============================================
Starting. Press "Enter" to stop training and save model.
Error: OOM when allocating tensor with shape[64512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node mul_69}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_86/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node Mean_11/_1133}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_8203_Mean_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\mainscripts\Trainer.py", line 109, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\models\ModelBase.py", line 525, in train_one_iter
    losses = self.onTrainOneIter(sample, self.generator_list)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\DeepFaceLab\models\Model_SAE\Model.py", line 509, in onTrainOneIter
    src_loss, dst_loss, = self.src_dst_train (feed)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
    run_metadata_ptr)
  File "D:\DeepFaceLab_CUDA_10.1_AVX\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node mul_69}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_86/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node Mean_11/_1133}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_8203_Mean_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.



Then I tried to run DeepFaceLab_CUDA_9.2_SSE_build_11_12_2019, and again the error appeared:



Running trainer.

Loading model...

Model first run.
Enable autobackup? (y/n ?:help skip:n) : y
Write preview history? (y/n ?:help skip:n) : y
Choose image for the preview history? (y/n skip:n) :
n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 3
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:y) :
y
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y ) :
y
Optimizer mode? ( 1,2,3 ?:help skip:1) :
1
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:512) :
512
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Use CA weights? (y/n, ?:help skip:n ) :
n
Use pixel loss? (y/n, ?:help skip:n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) : 1
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) : 1
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt, ?:help skip:none) : lct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
Loading: 100%|#######################################################################| 632/632 [00:09<00:00, 68.57it/s]
Loading: 100%|#####################################################################| 2175/2175 [00:37<00:00, 57.92it/s]
=============== Model Summary ===============
== ==
== Model name: SAE ==
== ==
== Current iteration: 0 ==
== ==
==------------- Model Options -------------==
== ==
== autobackup: True ==
== write_preview_history: True ==
== sort_by_yaw: False ==
== random_flip: True ==
== resolution: 128 ==
== face_type: f ==
== learn_mask: True ==
== optimizer_mode: 1 ==
== archi: df ==
== ae_dims: 512 ==
== e_ch_dims: 42 ==
== d_ch_dims: 21 ==
== ca_weights: False ==
== pixel_loss: False ==
== face_style_power: 1.0 ==
== bg_style_power: 1.0 ==
== ct_mode: lct ==
== clipgrad: False ==
== batch_size: 3 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce GTX 1650 ==
== VRAM: 4.00GB ==
== ==
=============================================
Starting. Press "Enter" to stop training and save model.
Error: OOM when allocating tensor with shape[64512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node mul_69}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_86/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node Mean_11/_1133}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_8203_Mean_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Traceback (most recent call last):
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\DeepFaceLab\mainscripts\Trainer.py", line 109, in trainerThread
    iter, iter_time = model.train_one_iter()
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\DeepFaceLab\models\ModelBase.py", line 525, in train_one_iter
    losses = self.onTrainOneIter(sample, self.generator_list)
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\DeepFaceLab\models\Model_SAE\Model.py", line 509, in onTrainOneIter
    src_loss, dst_loss, = self.src_dst_train (feed)
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\python-3.6.8\lib\site-packages\keras\backend\tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\python-3.6.8\lib\site-packages\tensorflow\python\client\session.py", line 1439, in __call__
    run_metadata_ptr)
  File "D:\DeepFaceLab_CUDA_9.2_SSE\_internal\python-3.6.8\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[64512,512] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node mul_69}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Adam/beta_2/read, Variable_86/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[{{node Mean_11/_1133}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_8203_Mean_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.



Does my new laptop need anything more?
Are my settings OK?
What am I doing wrong?
Can anyone help me, please?
 

Groggy4

NotSure
Verified Video Creator
Your settings seem fine. It's just that your 4 GB GPU can't handle the load alone, which gives you the OOM (out of memory) error. Use optimizer mode 2; it gets a helping hand from your RAM/CPU. Also leave the style powers at 0, as they are memory-demanding, and learning the mask is probably not necessary yet either; it can be enabled later. Then try with a batch size of 4.
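For scale, the tensor named in the OOM message (`shape[64512,512]`, float32) can be sized by hand, which also shows why optimizer mode 2 helps: Adam keeps two extra accumulators per weight, roughly tripling the optimizer-side footprint, and mode 2 moves that state off the GPU. A quick sketch (the 3x factor is the usual weights-plus-two-moment-slots rule of thumb, not something read out of DFL's code):

```python
# Back-of-envelope for the OOM above: one float32 weight matrix of
# shape [64512, 512], plus Adam's first- and second-moment slots.
BYTES_PER_FLOAT32 = 4

def tensor_mib(shape):
    """Size of a float32 tensor of the given shape, in MiB."""
    n = 1
    for dim in shape:
        n *= dim
    return n * BYTES_PER_FLOAT32 / 2**20

weights = tensor_mib([64512, 512])
with_adam_slots = weights * 3  # weights + Adam's m and v accumulators

print(f"weights alone:   {weights:.0f} MiB")         # 126 MiB
print(f"with Adam state: {with_adam_slots:.0f} MiB") # 378 MiB
```

And that is just one matrix; activations grow with batch size and resolution on top of it, so 4 GB fills up quickly at these dims.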
 

xxxxxx

DF Pleb
It's running... though I still see errors:

Running trainer.

Loading model...

Model first run.
Enable autobackup? (y/n ?:help skip:n) : y
Write preview history? (y/n ?:help skip:n) : y
Choose image for the preview history? (y/n skip:n) : y
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 4
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:y) : n
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y ) : n
Optimizer mode? ( 1,2,3 ?:help skip:1) : 2
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:512) :
512
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Use CA weights? (y/n, ?:help skip:n ) :
n
Use pixel loss? (y/n, ?:help skip:n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt/sot, ?:help skip:none) : lct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
Loading: 100%|######################################################################| 707/707 [00:02<00:00, 266.23it/s]
Loading: 100%|####################################################################| 2153/2153 [00:06<00:00, 340.66it/s]
Choose image for the preview history. [p] - next. [enter] - confirm.
=============== Model Summary ===============
== ==
== Model name: SAE ==
== ==
== Current iteration: 0 ==
== ==
==------------- Model Options -------------==
== ==
== autobackup: True ==
== write_preview_history: True ==
== sort_by_yaw: False ==
== random_flip: False ==
== resolution: 128 ==
== face_type: f ==
== learn_mask: False ==
== optimizer_mode: 2 ==
== archi: df ==
== ae_dims: 512 ==
== e_ch_dims: 42 ==
== d_ch_dims: 21 ==
== ca_weights: False ==
== pixel_loss: False ==
== face_style_power: 0.0 ==
== bg_style_power: 0.0 ==
== ct_mode: lct ==
== clipgrad: False ==
== batch_size: 4 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce GTX 1650 ==
== VRAM: 4.00GB ==
== ==
=============================================
Starting. Press "Enter" to stop training and save model.
2019-11-19 19:24:41.418020: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 888.70M (931869440 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[19:24:46][#000001][6455ms][1.4434][1.5399]
2019-11-19 19:30:25.036612: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 88.87M (93186816 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 19:30:25.042118: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 88.87M (93186816 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[19:30:39][#000267][1243ms][1.3630][1.1271]


Do I need to stop it, or let it run?
 

Groggy4

NotSure
Verified Video Creator
I'm not 100% sure. I just restarted it the few times I got that error. It usually makes your RAM usage spike for a while and then stabilize again, but I don't think it will ruin anything. Try restarting and see if it happens again. Maybe run batch size 3 for a while, then increase it to 4 later.
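Worth noting: those `failed to allocate ... CUDA_ERROR_OUT_OF_MEMORY` lines are allocator warnings rather than crashes. As long as iteration lines keep printing (like `[19:30:39][#000267]...` above), TensorFlow's GPU allocator failed a large request and retried with smaller ones. A rough illustration of that back-off behaviour (this is a hypothetical sketch, not TensorFlow's actual allocator code):

```python
def alloc_with_backoff(request_bytes, free_bytes, shrink=0.9, min_bytes=1 << 20):
    """Illustrative back-off: shrink a failed request until it fits.

    `free_bytes` stands in for the VRAM actually available; a real
    allocator would call into the CUDA driver instead of comparing ints.
    """
    size = request_bytes
    attempts = 0
    while size >= min_bytes:
        attempts += 1
        if size <= free_bytes:      # "allocation" succeeds
            return size, attempts
        size = int(size * shrink)   # log a warning, retry with a smaller block
    raise MemoryError("even the minimum request did not fit")

# The 932 MB request from the log, against ~500 MB of free VRAM,
# succeeds after several shrinks instead of killing training.
granted, tries = alloc_with_backoff(931_869_440, 500 * 2**20)
print(granted <= 500 * 2**20, tries > 1)
```

So if the iteration counter keeps advancing you can generally let it run; if iterations stop appearing, lower the batch size or the dims.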
 

xxxxxx

DF Pleb
I tried it 5 times; the errors stay the same.
I emptied the SAE_history folder.
How can I change the batch number?
When I close and restart step 6, it takes the old Model Summary.

When I press Enter it doesn't stop; it only stops when I close it. And when I start it again I cannot fill in the Model Options, so it keeps running with the old ones.
 

Groggy4

NotSure
Verified Video Creator
When you press "train SAE", you get a 2-second window to press Enter to edit the training settings.
 

Thief_Kise

DF Vagrant
Verified Video Creator
After loading, you can overwrite the old settings within 2 seconds by pressing enter.

"Press enter to override settings" (or similar text with the same meaning) should appear after a while.
 

xxxxxx

DF Pleb
Yes, oops, sorry... thanks!

It's running now, but I still get the error:


Running trainer.

Loading model...
Press enter in 2 seconds to override model settings.
Enable autobackup? (y/n ?:help skip:y) : y
Write preview history? (y/n ?:help skip:y) : y
Choose image for the preview history? (y/n skip:n) :
n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:4) : 3
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:n) :
n
Learn mask? (y/n, ?:help skip:n ) : n
Optimizer mode? ( 1,2,3 ?:help skip:2) :
2
Use pixel loss? (y/n, ?:help skip:n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt/sot, ?:help skip:lct) :
lct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
Loading: 100%|######################################################################| 707/707 [00:02<00:00, 275.97it/s]
Loading: 100%|####################################################################| 2153/2153 [00:06<00:00, 347.16it/s]
=============== Model Summary ===============
== ==
== Model name: SAE ==
== ==
== Current iteration: 1337 ==
== ==
==------------- Model Options -------------==
== ==
== autobackup: True ==
== write_preview_history: True ==
== sort_by_yaw: False ==
== random_flip: False ==
== resolution: 128 ==
== face_type: f ==
== learn_mask: False ==
== optimizer_mode: 2 ==
== archi: df ==
== ae_dims: 512 ==
== e_ch_dims: 42 ==
== d_ch_dims: 21 ==
== ca_weights: False ==
== pixel_loss: False ==
== face_style_power: 0.0 ==
== bg_style_power: 0.0 ==
== ct_mode: lct ==
== clipgrad: False ==
== batch_size: 3 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce GTX 1650 ==
== VRAM: 4.00GB ==
== ==
=============================================
Starting. Press "Enter" to stop training and save model.
2019-11-19 20:34:31.109970: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 888.70M (931869440 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[20:35:02][#001360][1031ms][0.5914][0.6432]
 

Groggy4

NotSure
Verified Video Creator
Does it happen in both 9.2_SSE and 10.1_AVX, or have you only tried one? Since you have a newer CPU, it most likely supports AVX instructions, so I would stick with that build. Try SAEHD if the error still exists. It is a bit heavier, but another user with the same GPU ran it the other day without that kind of error.
 

xxxxxx

DF Pleb
I tried SAEHD and got this:

Running trainer.

Loading model...

Model first run.
Enable autobackup? (y/n ?:help skip:n) : y
Write preview history? (y/n ?:help skip:n) : y
Choose image for the preview history? (y/n skip:n) :
n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 3
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:y) : n
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half, mid full, or full face? (h/mf/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y ) : n
Optimizer mode? ( 1,2,3 ?:help skip:1) : 2
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:256) :
256
Encoder/Decoder dims per channel (10-85 ?:help skip:21) :
21
Enable random warp of samples? ( y/n, ?:help skip:y) :
y
Enable 'true face' training? (y/n, ?:help skip:n) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt/sot, ?:help skip:none) : lct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
Loading: 100%|######################################################################| 707/707 [00:02<00:00, 279.37it/s]
Loading: 100%|####################################################################| 2153/2153 [00:06<00:00, 337.80it/s]
=============== Model Summary ===============
== ==
== Model name: SAEHD ==
== ==
== Current iteration: 0 ==
== ==
==------------- Model Options -------------==
== ==
== autobackup: True ==
== write_preview_history: True ==
== sort_by_yaw: False ==
== random_flip: False ==
== resolution: 128 ==
== face_type: f ==
== learn_mask: False ==
== optimizer_mode: 2 ==
== archi: df ==
== ae_dims: 256 ==
== ed_ch_dims: 21 ==
== random_warp: True ==
== true_face_training: False ==
== face_style_power: 0.0 ==
== bg_style_power: 0.0 ==
== ct_mode: lct ==
== clipgrad: False ==
== batch_size: 3 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce GTX 1650 ==
== VRAM: 4.00GB ==
== ==
=============================================
Starting. Press "Enter" to stop training and save model.
2019-11-19 20:59:21.190321: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 20:59:21.195319: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[... the same 892.70M CUDA_ERROR_OUT_OF_MEMORY message repeated ten more times ...]
Initializing CA weights: 100%|#########################################################| 49/49 [00:26<00:00, 1.83it/s]
[21:00:03][#000001][7904ms][1.6433][1.8285]
2019-11-19 21:00:06.014347: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:06.021663: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:06.033582: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:06.038253: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:12.061493: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:12.067286: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:12.075365: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:12.083915: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:14.554102: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:14.561832: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:14.567725: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:14.572716: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:15.584295: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:15.590224: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:15.596002: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:15.602867: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:16.610831: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:16.617159: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-11-19 21:00:16.622018: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[the same "failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory" line repeats roughly thirty more times over the next half minute, only the timestamps differing]

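Side note: every one of those lines reports the same failed allocation, 936063744 bytes (about 892.7 MiB), which is a large chunk of a 4 GB card once Windows has taken its share. For anyone grepping their own logs, here is a small hypothetical helper (not part of DeepFaceLab) that pulls the byte count out of such a TF 1.x allocation-failure line:

```python
import re

# Matches TF 1.x stream_executor allocation-failure lines such as:
#   "failed to allocate 892.70M (936063744 bytes) from device:
#    CUDA_ERROR_OUT_OF_MEMORY: out of memory"
_OOM_RE = re.compile(r"failed to allocate \S+ \((\d+) bytes\)")

def failed_alloc_bytes(log_line):
    """Return the byte count TF failed to allocate, or None for other lines."""
    m = _OOM_RE.search(log_line)
    return int(m.group(1)) if m else None
```

If the number returned is close to your card's free VRAM, the fix is the usual one for 4 GB cards: smaller batch size, lower resolution/dims, or a higher optimizer mode.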

I closed it and tried DeepFaceLab_CUDA_9.2_SSE_build_11_12_2019 instead:

Got this:

Running trainer.

Loading model...

Model first run.
Enable autobackup? (y/n ?:help skip:n) : y
Write preview history? (y/n ?:help skip:n) : y
Choose image for the preview history? (y/n skip:n) :
n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 3
Feed faces to network sorted by yaw? (y/n ?:help skip:n) :
n
Flip faces randomly? (y/n ?:help skip:y) : n
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) :
f
Learn mask? (y/n, ?:help skip:y ) : n
Optimizer mode? ( 1,2,3 ?:help skip:1) : 2
AE architecture (df, liae ?:help skip:df) :
df
AutoEncoder dims (32-1024 ?:help skip:512) :
512
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Use CA weights? (y/n, ?:help skip:n ) :
n
Use pixel loss? (y/n, ?:help skip:n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt, ?:help skip:none) : lct
Enable gradient clipping? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
Loading: 100%|######################################################################| 632/632 [00:02<00:00, 267.93it/s]
Loading: 100%|####################################################################| 2175/2175 [00:06<00:00, 330.72it/s]
=============== Model Summary ===============
==                                         ==
== Model name: SAE                         ==
==                                         ==
== Current iteration: 0                    ==
==                                         ==
==------------- Model Options -------------==
==                                         ==
== autobackup: True                        ==
== write_preview_history: True             ==
== sort_by_yaw: False                      ==
== random_flip: False                      ==
== resolution: 128                         ==
== face_type: f                            ==
== learn_mask: False                       ==
== optimizer_mode: 2                       ==
== archi: df                               ==
== ae_dims: 512                            ==
== e_ch_dims: 42                           ==
== d_ch_dims: 21                           ==
== ca_weights: False                       ==
== pixel_loss: False                       ==
== face_style_power: 0.0                   ==
== bg_style_power: 0.0                     ==
== ct_mode: lct                            ==
== clipgrad: False                         ==
== batch_size: 3                           ==
==                                         ==
==-------------- Running On ---------------==
==                                         ==
== Device index: 0                         ==
== Name: GeForce GTX 1650                  ==
== VRAM: 4.00GB                            ==
==                                         ==
=============================================
Starting. Press "Enter" to stop training and save model.
2019-11-19 21:08:00.532045: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 892.70M (936063744 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[21:08:05][#000001][6139ms][1.5933][1.5592]
[21:10:08][#000108][1046ms][1.1483][1.3257]
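For what it's worth, a single allocation failure at startup followed by normal iteration lines like those above usually means TF's allocator probed for one big block, failed, and fell back to smaller allocations, so training still runs. A hedged sketch of the TF 1.x session options that control this behavior (`allow_growth` and `per_process_gpu_memory_fraction` are standard `tf.ConfigProto` GPU options; the helper function name is made up, and this is not DeepFaceLab's own code, which manages sessions internally):

```python
# Sketch only: TF 1.x session config that stops the allocator from
# trying to reserve most of a 4 GB card up front.
def make_session_config(memory_fraction=0.8):
    import tensorflow as tf  # TF 1.x, as bundled with these builds

    config = tf.ConfigProto()
    # Allocate VRAM on demand instead of grabbing it all at startup.
    config.gpu_options.allow_growth = True
    # Never claim more than this fraction of total VRAM.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config

# Usage: sess = tf.Session(config=make_session_config())
```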
 

Groggy4

NotSure
Verified Video Creator
Guess something could be interfering after you removed the CUDA 10.1 install. Not sure what you should do. Probably have to reinstall the GPU drivers somehow. Maybe try the "Studio Driver" in GeForce Experience.


Forgot to say: if you install the Studio Driver, don't do a quick install; pick the custom install instead. Then you get the option to do a clean install, which removes any other drivers and profiles. Could help, but as always, not entirely sure.
 

xxxxxx

DF Pleb
First I reinstalled the GeForce Game Ready Driver and tried another run of step 6; it didn't work.
Then I downloaded the NVIDIA Studio Driver from the NVIDIA website (I couldn't get it via GeForce Experience), and once it showed up in GeForce Experience I reinstalled it with the custom install option to get a clean install.
Then I tried DeepFaceLab_CUDA_10.1_AVX and got this error:

Running trainer.

Loading model...
Press enter in 2 seconds to override model settings.
Using TensorFlow backend.
Loading: 100%|#######################################################################| 707/707 [00:12<00:00, 55.90it/s]
Loading: 100%|#####################################################################| 2153/2153 [00:33<00:00, 65.16it/s]
=============== Model Summary ===============
==                                         ==
== Model name: SAE                         ==
==                                         ==
== Current iteration: 2116                 ==
==                                         ==
==------------- Model Options -------------==
==                                         ==
== autobackup: True                        ==
== write_preview_history: True             ==
== sort_by_yaw: False                      ==
== random_flip: False                      ==
== resolution: 128                         ==
== face_type: f                            ==
== learn_mask: False                       ==
== optimizer_mode: 2                       ==
== archi: df                               ==
== ae_dims: 512                            ==
== e_ch_dims: 42                           ==
== d_ch_dims: 21                           ==
== ca_weights: False                       ==
== pixel_loss: False                       ==
== face_style_power: 0.0                   ==
== bg_style_power: 0.0                     ==
== ct_mode: lct                            ==
== clipgrad: False                         ==
== batch_size: 3                           ==
==                                         ==
==-------------- Running On ---------------==
==                                         ==
== Device index: 0                         ==
== Name: GeForce GTX 1650                  ==
== VRAM: 4.00GB                            ==
==                                         ==
=============================================
Starting. Press "Enter" to stop training and save model.
2019-11-19 22:57:38.901694: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 888.70M (931869440 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
[23:12:42][#002920][1056ms][0.6009][0.6827]
[23:15:37][#003078][1031ms][0.5671][0.5662]
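Since the original "CUDA driver version is insufficient for CUDA runtime version" error comes down to the installed display driver being older than what the CUDA runtime requires (CUDA 10.1 needs roughly the 418-series driver or newer), it can help to check the driver version directly. A small hypothetical helper around `nvidia-smi` (the query flags are standard nvidia-smi options; the function names are mine):

```python
import subprocess

def parse_driver_version(csv_output):
    """First GPU's driver version from `nvidia-smi
    --query-gpu=driver_version --format=csv,noheader` output
    (one version string per line, one line per GPU)."""
    return csv_output.strip().splitlines()[0].strip()

def driver_version():
    # nvidia-smi ships with the NVIDIA driver itself, so this works
    # regardless of which CUDA toolkit (if any) is installed.
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        universal_newlines=True,
    )
    return parse_driver_version(out)
```

If the reported version is below what your CUDA build needs, a clean driver reinstall (as suggested above) is the fix.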
 