Akanee said:
dpfks said:
I agree, I never use 256, but then again I don't do just photos
You're right, all of you.
I found that too low as well.
I went back to 128 because I honestly don't think there will be a big difference with 128.
I asked that question because with photos we hit a problem that comes up more often than in video: close-up shots.
I mean, often in photos the face is very near the camera, and I always ask myself whether SAE with DF, or maybe LIAE, will give good results in terms of definition or resolution.
It's like my nightmare, you know...
I do photos all the time. It's fucking tedious. However, I've found some things that help speed up the process and give better results.
I used to go photo by photo, sizing each one so it would fit within my 128-224 face resolutions. One by one by one. Sometimes up to 4,000 at a time. It would take me about 6-7 hours to go through 2,000 photos. Way too long. Here is my process now.
Make 2-4 copies of each photo, make it so the naming convention is something like:
photo1, photo1_1, photo1_2, photo1_3
photo2, photo2_1, photo2_2, etc.
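The copy-and-rename step above is easy to script rather than doing by hand. A minimal sketch in Python; the folder names and the two placeholder "photos" are made up for the demo:

```python
import shutil
from pathlib import Path

SRC_DIR = Path("photos")        # hypothetical folder of originals
OUT_DIR = Path("photos_multi")  # hypothetical output folder
COPIES = 3                      # duplicates per photo: photoX_1 .. photoX_3

# demo setup: two placeholder files (in practice your photos already exist)
SRC_DIR.mkdir(exist_ok=True)
for name in ("photo1.jpg", "photo2.jpg"):
    (SRC_DIR / name).write_bytes(b"fake image data")

OUT_DIR.mkdir(exist_ok=True)
for photo in sorted(SRC_DIR.glob("*.jpg")):
    # keep the original name, then add the suffixed duplicates
    shutil.copy(photo, OUT_DIR / photo.name)              # photo1.jpg
    for i in range(1, COPIES + 1):                        # photo1_1.jpg ...
        shutil.copy(photo, OUT_DIR / f"{photo.stem}_{i}{photo.suffix}")

names = sorted(p.name for p in OUT_DIR.glob("*.jpg"))
print(names)
```

Point `SRC_DIR` at your real photo folder and drop the demo-setup lines.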
So now you have 3-4 copies of each photo. In Premiere, import all of them and make a new sequence with just the photoX files.
Now make a new sequence after it with just the photoX_1 files, then another with photoX_2, and so on, so that you have multiple sequences of the same photos.
Then for sequence 1, leave it as-is. For sequence 2, select all frames and scale -50%. For sequence 3, select all and scale 80%. For sequence 4, select all and scale 125%. And so on.
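Reading "scale -50%" as halving the frame and the other steps as scale factors, the per-sequence sizes work out as below; a quick sketch for a hypothetical 1920x1080 photo:

```python
# Per-sequence scale factors matching the steps above:
# seq 1 untouched, seq 2 at -50% (half size), seq 3 at 80%, seq 4 at 125%.
SCALES = {1: 1.00, 2: 0.50, 3: 0.80, 4: 1.25}

def scaled_size(width, height, factor):
    """Return the frame size after applying a uniform scale factor."""
    return round(width * factor), round(height * factor)

# hypothetical 1920x1080 source photo
for seq, f in SCALES.items():
    print(seq, scaled_size(1920, 1080, f))
# seq 2 -> (960, 540), seq 3 -> (1536, 864), seq 4 -> (2400, 1350)
```

The point of the spread is that each copy lands the face at a different pixel size, so at least one version should sit comfortably inside your extraction resolution.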
Lastly, if you want, you can make contrast, color, and denoise changes in yet more sequences.
Yes, this means you sometimes end up with 12,000-16,000 frames. Yes, it will take time finding faces, but far, far less time than you would spend manually manipulating each frame/photo. And if you already have a well-trained src model, it doesn't matter; if your model is near perfect, it only takes maybe an hour of running for it to match the new dst. Now you convert all photos (yes, all 12k-20k) in both rct and then lct. What you end up with is the largest possible set of good photos to use, with hardly any pre-processing or post-processing. Since some face sizes and colors work better than others, this method covers almost every possible output for each photo, ensuring you have a good end product.
tl;dr probably just don't do photos, it's ridiculously tedious, but it teaches you a lot about everything
titan_rw said:
I figured I'd throw this in here since this thread has a lot of info about the memory optimizer modes in DFL.
If you use mode 2 or 3, PCI-E bandwidth is king. Especially when using mode 2.
I juggled around the PCI-E cards in my computer and got my Titan, which was running at x8, up to x16. This got me 5-15% faster iterations, depending on batch size and on mode 2 vs. mode 3.
This makes sense thinking back on it. From what I gather: mode 1 just keeps everything in VRAM all the time. Mode 2 juggles data between VRAM and system RAM to allow bigger models or a higher batch size. Depending on the speed of the GPU, how much VRAM it has, and the size of the model, this swapping between system RAM and graphics RAM can be limited by the bus speed. Mode 3 seems less affected; I'm assuming it uses the CPU for some of the processing, so there's more time available for bus transfers.
I was seeing 80-90% peak bus usage at x8, but only 60-70% at x16. These are peaks, not sustained continuous usage. But if I'm seeing a 70% peak at x16, that would be the equivalent of a 140% peak at x8. Obviously that's not possible, so training stalls for a fraction of a second while the PCI-E transfer finishes.
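The "70% at x16 would be 140% at x8" step above is just scaling utilisation by link width; a quick sketch (the per-lane throughput figure is an approximate PCIe 3.0 number, not measured on this machine):

```python
# PCIe 3.0 moves roughly 0.985 GB/s per lane after 128b/130b encoding,
# so x8 is about 7.9 GB/s and x16 about 15.8 GB/s.
GBPS_PER_LANE = 0.985

def peak_equivalent(peak_util, lanes_from, lanes_to):
    """Translate a peak bus utilisation on one link width to another."""
    return peak_util * lanes_from / lanes_to

# A 70% peak on x16 corresponds to 140% of what an x8 link can carry,
# i.e. the x8 link saturates and the GPU waits on the transfer.
print(peak_equivalent(0.70, 16, 8))  # -> 1.4
```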
I'm still not understanding the iteration times. It seems like a while back I could push to a batch size of 20 at 128 res and still be at around 1,800 ms. Ever since the optimizer modes came in, and yes, I realize the trade-off of offloading memory, even something like 145 res on optimizer mode 2 is sometimes around 3 seconds! I try everything I can to avoid the optimizer, but there doesn't seem to be a way anymore.
My baseline default training model settings are now:
batch size: 4-5
resolution: 224
ae_dims: 777
e_ch_dims: 55
d_ch_dims: 33
optimizer: 2
iteration = 2,600ms
Sometimes, when I'm going to let it run for a few hours straight, I'll set it to a batch size of 6-7 and optimizer mode 3, but then iterations are around 4-5 seconds.
I'm still trying to figure out whether (after, say, 25k iterations) larger batch sizes at slower iteration times are better, or smaller batch sizes at faster times. Hard to determine what's better.
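One way to compare the settings above is effective throughput: faces seen per wall-clock second, i.e. batch size divided by iteration time. A rough sketch plugging in the numbers quoted above (taking the top of each batch range and 4.5 s as the mode-3 time):

```python
def faces_per_second(batch_size, iter_ms):
    """Effective training throughput: faces processed per second."""
    return batch_size * 1000.0 / iter_ms

# batch 5 at 2,600 ms vs. batch 7 at ~4,500 ms (mode 3)
print(round(faces_per_second(5, 2600), 2))  # -> 1.92
print(round(faces_per_second(7, 4500), 2))  # -> 1.56
```

By this metric the batch-5 setting actually pushes more faces through per second, though larger batches can still give smoother gradients per step, so raw throughput isn't the whole story.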