Mr DeepFakes Forums
Issue with Google Colab
dennisbgi7
#1
Hi, so I am having an issue with Google Colab. I imported the workspace and started training as instructed in the guide. However, for the last couple of hours it seems to be stuck in one place.
Code:
/content
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Creating workspace archive ...
Archive created!
Time to end session: 12 hours
Running trainer.

Loading model...

Model first run. Enter model options as default for each run.
Write preview history? (y/n ?:help skip:n) : n
Target iteration (skip:unlimited/default) :
0
Batch_size (?:help skip:0) : 8
Feed faces to network sorted by yaw? (y/n ?:help skip:n) : y
Flip faces randomly? (y/n ?:help skip:y) : n
Src face scale modifier % ( -30...30, ?:help skip:0) :
0
Resolution ( 64-256 ?:help skip:128) :
128
Half or Full face? (h/f, ?:help skip:f) : f
Learn mask? (y/n, ?:help skip:y) : y
Optimizer mode? ( 1,2,3 ?:help skip:1) : 2
AE architecture (df, liae ?:help skip:df) : df
AutoEncoder dims (32-1024 ?:help skip:512) : 786
Encoder dims per channel (21-85 ?:help skip:42) :
42
Decoder dims per channel (10-85 ?:help skip:21) :
21
Use multiscale decoder? (y/n, ?:help skip:n) : y
Use CA weights? (y/n, ?:help skip: n ) :
n
Use pixel loss? (y/n, ?:help skip: n ) :
n
Face style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Background style power ( 0.0 .. 100.0 ?:help skip:0.00) :
0.0
Apply random color transfer to src faceset? (y/n, ?:help skip:n) :
n
Pretrain the model? (y/n, ?:help skip:n) :
n
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Loading: 100% 1111/1111 [00:02<00:00, 426.89it/s]
Sorting: 100% 64/64 [00:00<00:00, 2193.37it/s]
Loading: 100% 4556/4556 [00:06<00:00, 711.55it/s]
Sorting: 100% 64/64 [00:00<00:00, 481.35it/s]
===== Model summary =====
== Model name: SAE
==
== Current iteration: 0
==
== Model options:
== |== batch_size : 8
== |== sort_by_yaw : True
== |== random_flip : False
== |== resolution : 128
== |== face_type : f
== |== learn_mask : True
== |== optimizer_mode : 2
== |== archi : df
== |== ae_dims : 786
== |== e_ch_dims : 42
== |== d_ch_dims : 21
== |== multiscale_decoder : True
== |== ca_weights : False
== |== pixel_loss : False
== |== face_style_power : 0.0
== |== bg_style_power : 0.0
== |== apply_random_ct : False
== Running on:
== |== [0 : Tesla T4]
=========================
Starting. Press "Enter" to stop training and save model.
/usr/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
 len(cache))
2019-06-11 03:02:25.987149: E tensorflow/stream_executor/cuda/cuda_driver.cc:868] failed to alloc 4294967296 bytes on host: CUDA_ERROR_INVALID_VALUE: invalid argument
[03:02:35][#000001][26.90s][5.4991][3.2766]

Is there anything that I might be doing wrong? What could be the issue?
#2
Colab can't handle as much as your RTX 2060. You're out of resources. Try some of these (rough settings sketch below):
Lowering the batch size
Lowering ae_dims
Increasing the optimizer mode to 3.

Keep in mind that if you start training on that card with aggressive settings, you probably won't be able to get them to work when you try to offload the processing to Colab.
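
For example, on the next run the answers to those prompts might look roughly like this. The values (batch 4, ae_dims 256) are just placeholders, not settings I've tested, and as far as I know the architecture dims are fixed once a model has been created, so lowering ae_dims would mean starting a fresh model:
Code:
Batch_size (?:help skip:0) : 4
Optimizer mode? ( 1,2,3 ?:help skip:1) : 3
AutoEncoder dims (32-1024 ?:help skip:512) : 256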
#3
(06-15-2019, 09:44 PM)goldenp Wrote: Colab can't handle as much as your RTX 2060. You're out of resources. Try some of these:
Lowering the batch size
Lowering ae_dims
Increasing the optimizer mode to 3.

Keep in mind that if you start training on that card with aggressive settings, you probably won't be able to get them to work when you try to offload the processing to Colab.

But Colab is using a Tesla T4, which is a much more powerful card than an RTX 2060, isn't it?
#4
I have an RTX 2060 as well, and I can push it harder than the T4 on Colab. I thought the same as you, but that wasn't the case when I put it to the test.
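
If you want to see what the Colab session actually gives you, you can run these in a notebook cell (standard Linux commands, nothing DFL-specific). The failed 4 GB alloc in your log was on the host, so the session's system RAM is worth checking too, not just the T4's VRAM:
Code:
!nvidia-smi   # which GPU you were assigned and how much VRAM is in use
!free -h      # the session's system RAM (your failed alloc was on host)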
