MrDeepFakes Forums
  • New and improved dark forum theme!
  • Guests can now comment on videos on the tube.
   
tutsmybarrehFrequently asked questions - DeepFaceLab 1.0/2.0
#1
NOTE: Following FAQ was written with DFL 1.0 in mind which is no longer supported so some of the information in here is outdated and features are missing in DFL 2.0 but overall you can still use it to find answers for some of the more frequent issues you will encounter while making deepfakes with DFL 2.0.

Use ctrl+f to find what you are looking for, scroll to the bottom for some tips and links to some useful stuff.

1. Q: What are Deepfakes?


A: Deepfakes are fake videos that swap an individual's face with another face using artificial intelligence (AI) or machine learning.

2. Q: How are Deepfakes made?

A: The process of creating deepfakes is largely simplified by apps that were created to streamline the process for users that do not know any coding or commands in Python. The two most active apps that are currently being updated are You are not allowed to view links. Register or Login to view. and You are not allowed to view links. Register or Login to view..

3.Q: What is the FakeApp everyone is talking about?

A: In the early days of deepfakes, there was an app called FakeApp that first introduced the Internet to deepfakes.
However this app is no longer being developed, and despite advances to other deepfake applications, the media is misinformed and believes this is the app that is currently being used by deepfakers. We DO NOT recommend using FakeApp.

4. Q: How long does it take to make a deepfake?

A: Depending on how long your target video is, and how large your data (facesets) are, it may take anywhere between 1-7 days to make a convincing deepfake.

5. Q: Can you make a deepfake video with just a few pictures?

A: In general, the answer will be no. The recommended way to make a faceset to make a decent deepfake is to use videos.
The more angles and facial expressions the better. Sure you can try to make a deepfake video with just a couple hundred photos, it will work, but the results will be less convincing.

6. Q: What is the ideal faceset size?

A: For the data_src (celebrity) faceset, I recommend anywhere between 1000-5000 images.
The number does not matter, as long as you have a wide range of different angles and facial expressions.

7. Q: Why are my deepfakes turning out blurry?

A: It is most likely due to lack of training time.
It is recommended to train over 100k iterations. It may also be that you're facesets are not properly aligned, so make sure they are clean. See the guide section.

8. Q: Why is my model not blinking in the final deepfake video?

A: This is most likely due to the lack of closed eyes images in your data_src. Make sure you have a decent amount of facial expressions and angles to match the scene your data_dst is in.

9. Q: When should I stop training?

A: There is no correct answer, but the general consensus is to use the preview window to judge when to stop training and convert.
There is no exact iteration number or loss value where you should stop. I would recommend at least 100,000 iterations, but use the preview window to judge, if faces are sharp and your previews are not improving anymore then your model probably won't learn much more, it all depends on quality of your src and dst datasets, if you have blurry images in your src then they may cause model to not improve/train as well as if there were no blurry images in that dataset, always ensure you only use high quality, sharp images and videos for both src and dst for optimal results.

10. Q: Training isn't starting, I'm getting OOM/out of memory errors.

A: Either your GPU is not supported, you have wrong DFL version or you are running out of VRAM.

1. Make sure you've downloaded the right version of DFL for your GPU:
- CUDA10.1SSE or for Nvidia GTX GPUs and CPUs with AVX support
- CUDA10.1AVX for Nvidia GTX/RTX GPUs and CPUs with AVX support
- OpenCLSSE for AMD and Intel GPUs

2. Check if your GPU is supported, for CUDA DFL requires CUDA compute capability of 3.5:
You are not allowed to view links. Register or Login to view.

3. If you are getting OOM errors it means you are running out of VRAM, there are various settings you can change in some models to fix that:

a) decrease batch size (All models) - lower batch size means the model loads less images at the same time into memory, thus using less of it but it means that you will have to train longer to achieve the same result than with higher batch size, extremely low batch size like 2-4 may also lead to less accurate model but that is just a speculation and not 100% confirmed to be true.

b) use different optimizer modes (SAE/SAEHD) - mode 2 and 3 will utilize RAM to offload some of the data from GPU VRAM at the cost of the training speed

c) don't use style powers (SAE/SAEHD) - enabling them increases iteration/training time and uses more vram, if you have trouble running them but don't want to decrease batch size, set them to 0

d) don't use pixel loss (SAE) - pixel loss has small performance hit, you may try disabling it if it causes you to run out of VRAM/get OOM errors.

e) don't use learned mask (SAE/SAEHD) - learning masks during training increases vram usage and slows down training/increases iteration time.

f) turn off true face training (SAEHD) - true face increases VRAM usage and slows down training/increases iteration time.

-----

Bear in mind that these 2 setting bellow cannot be changed later, only when creating new model:

g) run models with reduced resolution - even with all the optimization you can do and disabling various features you may still not be able to run your desired resolution, just decrease it till you can run it (by the factor of 16)

h) decrease autoencoder and encoder/decoder_ch dimensions (not recommended) - decreasing them can make it possible to run additional features/higher resolution models but at the cost of accuracy, for more info about them scroll down to question #41


4. For other errors, try restarting your PC.

11. Q: My GPU is supported, I've set lower batch size and I have the right DFL version and yet I'm still getting errors.

A: Either you have driver (GPU) issue, your model is broken/not compatible with the version of DFL you are using or your datasets (data_src and data_dst) are broken/incompatible/missing.

1. Try using different model or a new one (backup model files from model folder, delete them and start over again with new one), if that doesn't fix errors, it may mean that there is issue with your datasets (be it missing files or datasets extracted with different software. Use google or forums search function to find bits of that error, someone may have already posted about it and there may be a fix for that, otherwise go to DFL github page and report or look for solution there: You are not allowed to view links. Register or Login to view.

2. If you did all of it and you are still getting errors, restart your pc and try again.
3. Still nothing? You've checked forums, google and github and found nothing? Check your drivers, cuda, etc for issuses. If none of it makes a difference, then you can create a new thread (or report bug on github page).

12. Q: I'm getting some error about layers "You are trying to load a weight file containing X layers into a model with Y layers" while trying to train my models in new version of DFL.

A: DFL underwent some changes, in particular the SAE model (both DF and LIAE architecture) so if you are trying to use your old trained models they will require retraining or creating a new one.


1. To retrain your old models copy all files to a model folder and delete all files with word decoder in them, it will create new decoder files upon starting training, it's a bit faster than starting from 0 but it will still take some time to get it back to previous state.
2. Or start a new one.

If you want to still use old models you need to use old DFL version, to get one, first download DFL for you GPU:

- CUDA9.2SSE or 10.1SSE for Nvidia GTX GPUs
- CUDA10.1AVX for Nvidia RTX GPUs
- OpenCLSSE for AMD and Intel GPUs

And then go to github page and download this: You are not allowed to view links. Register or Login to view.
This will download older version of files that are located inside _internal/DeepFaceLab, you just need to delete them all and replace with files from the .zip archive you just downloaded from github, this will efectivelly downgrade your DFL to a version from 07.09, right before SAE model update.

13. Q: I often see overkill on number of frames in celebrity datasets. What are the best tools for automating the process of removing duplicates beyond the DFL bat files to shrink to "minimum frames, maximum variation" as you might call it?

A: There are probably some tools that could try to detect similar faces and delete them. I just do this manually. Also If it's under 5000 thousand I'd just leave as it is, unless you have 10k-15k of pictures I wouldn't bother about deleting similar duplicate ones, it won't hurt anything and some of those similar frames may just feature some unique detail that may help achieve better results.
App suggested by @666VR999 for further dataset cleanup: You are not allowed to view links. Register or Login to view. You are not allowed to view links. Register or Login to view.

14. Q: I was training my model that already had several thousands iterations but the faces in preview window suddenly turned black/white/look weird, my loss values went up/are at zero.

A: Your model has collapsed, it means you cannot use it anymore and you have to start all over or if you had backups, use them.
To prevent model collapsing use gradient clip when starting training. Things that cause collapses can be:
- Face/Background style powers
- Pixel Loss
- other reasons: some models can collapse even without pixel loss/style powers so always enable gradient clipping

15. Q: Hello, I'm new here and I want to make deepfakes, which software should I use/how do you use fakeapp/I'm getting errors with fakeapp/where are the links for old faceswap software?

A: Use DFL or FaceSwap, all other software is old, outdated and not supported anymore.

DFL guide and downloads: You are not allowed to view links. Register or Login to view.
Face swap guide and downloads: You are not allowed to view links. Register or Login to view.
We recommend DFL as it's easier to use and usually gives better results, also there is more support for DFL on the web.

16. Q: I don't want to retrain my models but I don't have access to older/previous DFL version before changes to SAE models, where can I download it?

A: Here You are not allowed to view links. Register or Login to view.

Or here: You are not allowed to view links. Register or Login to view. (07.09 release, 10.1SSE, fixed converter, new color transfer modes, old SAE model).

17. Q: How do you make such good looking deepfakes? Is there a guide?

A: There is.

Check our guide section for tutorials on creating deepfakes: You are not allowed to view links. Register or Login to view.
Finding porn star matches/celebrity porn star lookalikes/doppelgangers: You are not allowed to view links. Register or Login to view.
Celebrity datasets: You are not allowed to view links. Register or Login to view.
Other guides: You are not allowed to view links. Register or Login to view. 

18. Q:
If I train a model of Celeb A (data_src) and use Celeb B (data_dst) as the destination can I use the same model of Celeb A to swap with a new Celeb C? Can I reuse models?

A: Yes, it is actually recommended to reuse you models.

The way you do it is simple, you just find a new video you want to use as destination, extract frames, align them and you can starrt training with the same model and the same data_src/source dataset/faceset of your celebrity. Reusing models gives much more realistic models.

19. Q: Should I pretrain my models?

A: As with reusing, yes, you should pretrain, you can do it in two ways:

1. Either use the built in pretrain function inside DFL which you can select when starting a new model.
2. Or use your own datasets and just train your models as you would normally do.

20. Q: I'm getting an error: is not a dfl image file required for training in DeepFaceLab

A: It means that the pictures inside data_src/aligned and/or data_dst are not valid for training in DFL.

This can be caused be several things:

1. You are using one of the shared datasets of a celebrity, chances are they were made in a different software than DFL or in older version of it, even though the look like aligned faces (256x256 images) they may be just pictures extracted in different app that stored landmarks/alignment data in different way. To fix them all you need to is to just run alignment process on them, just place them into a data_src folder (not "aligned" folder inside it) and align them again by using "4) data_src extract faces S3FD".

2. You edited faces/images inside aligned folder of data_src or data_dst in gimp/photoshop after aligning. When you edit those images you overwrite landmarks/alignments data that is stored inside them (all you can do is change their names, other changes, like editing in photoshop, changing metadata or rotating in windows will cause you to loose that data. If you want to edit pictures in any way you must do it before aligning them. To fix these images, put them back into the base folder (data_src/dst, not "aligned" inside them), change the name of original aligned folder to something else so you don't loose them when aligning new ones (like aligned_1) and run alignment process on them, then just combine both folders and replace these old "bad" images with new properly extracted ones.

3. You have regular, non extracted/aligned images in your data_src(or dst)/aligned folder.

21. Q: I'm getting errors during conversion: no faces found for X.jpg/png, copying without faces.

A: It means that for X frame in data_dst no faces were extracted into aligned folder.


This may be because there were actually no faces visible in that frame (which is normal) or they were visible but due to an angle at which they were or obstruction they were not detected.

To fix that you need to extract those faces manually by deleting images corresponding to that frame from the folder aligned_debug located inside data_dst and then running "5) data_dst extract faces MANUAL RE-EXTRACT DELETED RESULTS DEBUG", it will scan that folder for deleted images and display a preview where you can manually detect the faces. There are on screen instructions how to do it.
After that those faces will be extracted to aligned folder, take note during manual aligning that the red box is covering all the landmarks and is in good position/rotation, if it isn't or is rotated it may cause the aligned face to be rotated and unusable (it will produce blurry face on that frame after converting).

Overall you should make sure that you have as many faces aligned and properly extracted BEFORE starting to train.
Read this celebrity dataset guide and dfl guide for how to prepare datasets before training (manual extraction and cleanup of badly aligned faces):
You are not allowed to view links. Register or Login to view.
You are not allowed to view links. Register or Login to view.

22. Q: I'm getting errors: Warning: several faces detected. Highly recommended to treat them separately and Warning: several faces detected. Directional Blur will not be used. during conversion

A: It's caused by multiple faces within your data_dst/aligned folder.


When DFL extractcs/aligns faces from data_dst it does it to all faces it can find on given video frame/picture. If it does detect multiple it creates files that look like this:
0001_0.jpg 0001_1.jpg 0001_2.jpg (in case of detecting 3 faces).

In that case you need to get rid of all faces and only leave files that end with _0, ideally you want to delete them before training (so your model doesn't learn unnecessary faces) but if you already trained it you can just delete them before converting.

1. One issue with this is that DFL doesn't always group the same faces with the same prefix so some of the faces with _0 may be the right ones and at the same time on other frames it may have detected them differently so _0 is the wrong face (of someone else) and _1 is the right one. If that's the case you will have to move all the _1/_2/etc faces and also all the bad _0 faces to a different folder, then delete all faces except for the right ones (ignoring the _0/1/2 prefix) and then deleting those frames from aligned_debug and extracting those frames manually with "5) data_dst extract faces MANUAL RE-EXTRACT DELETED RESULTS DEBUG".

2. If all faces with _0 are good faces and other prefixes contain wrong faces, just delete them.

3. Third possibility is when DFL detects the same face twice, it may happen that they would both look correct, then you just delete _1 but if it happens that _1 is good and _0 is for example rotated/smaller (so just badly aligned) you still need to do manual extraction by deleting that frame from aligned_debug and running manual extraction with "5) data_dst extract faces MANUAL RE-EXTRACT DELETED RESULTS DEBUG".

23. Q: After converting/merging using converter/merger I see original/dst faces on some or all merged frames.

A: Make sure your converter mode is set to overlay or any other mode except for "original" and make sure you've aligned faces from all frames of your data_dst.mp4 file.

If you only see original faces on some frames, it's because they were not detected/aligned from those corresponding frames, it may happen due to various reasons: extreme angle where it's hard to see the face, blur/motion blur, obstructions, etc. Overall you want to always have all faces from your data_dst.mp4 aligned
To fix that, just go through aligned_debug folder and if you see a frame without landmarks on the face - delete it, you can browse this folder quicker by using 5.1) data_dst view aligned_debug results
Delete image files from data_dst/aligned_debug folder that correspond to frames on which you see original faces during conversion/in the merged folder and re-extract them manually using 5) data_dst extract faces MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG.

if you don't see any faces at all/only see original faces (always check with overlay or raw mode) you may missing entire content of data_dst/aligned folder - in that case just extract/align them all again.

If you did all of it, you have your data_dst/aligned folder with all the aligned faces, converter is set to the right mode and you still only see the original faces, then you are probably experiencing some error we don't know about - in that case restart your pc or try different version of DFL.

24. Q: I'm getting dark/bright outline/shadow under/around edge of the face/mask when converting my deepfake.

A: It's an issue with masking when using certain modes (hist match) and some color transfer modes and blur/erode.

Modes that cause this issue are "Hist Match", no matter which one they all have a feature called "Masked Hist Match" which you can toggle with letter "Z" and changing it will result in different look but the shadow under it will disappear, as for color transfer modes which cause it are all that end with -m (mix-m, sot-m, idt-m, mkl-m).
To avoid them you can decrease mask size by increasing value for "Erode Mask" and then adjust "Blur Mask" so that it doesn't cause more of this shadow/outline but enough to disguise the seam.

25. Q: I don't know which model to use for training, which model is the best?

A: It depends, read the guides and experiment yourself.
We recommend either H128 or SAE with DF or LIAE architecture.
Most fakes are done with SAE DF, then H128 and SAE LIAE. There is also a new model called TrueFace but it's still heavily experimental and shouldn't be used as it will undergo many changes in the future (probably).

26. Q: What do those 0.2513 0.5612 numbers mean when training?

A: These are loss values. They indicate how well model is trained.
But you shouldn't focus on them unless you see sudden spikes in their value (up or down) after they already settled around some value, instead focus on preview windows and look for details like teeth separation, beauty marks, nose, eyes, if they are sharp and look good, then you don't have to worry about anything. Remember to always use gradient clipping when training SAE models to prevent model collapse.

27. Q: What are the ideal loss values, how low/high loss values should be?

A: It all depends on the type of model, settings, datasets and various different factors.
They are also not equal between models and using for example pixel loss will cause them to drop immediately but it doesn't mean you can treat loss values of pixel loss enabled LIAE SAE model to a basic H128 one. They can also change with DFL/model updates but as of right now (27.09.19) for example:
- for SAE DF model without pixel loss, without style powers, without learned mask, with CA weights enabled, no multi scale decoder (as it was removed from SAE models on 13.09.19 update) loss values should be under 0.2, that goes for both src and dst loss values.

28. Q: My model has collapsed, can I somehow recover it?

A: No, you need to start over, or use backup if you made them.

29. Q: Which mask type should I use? Learned? Dst? Fan-dst? Or maybe something else? What do these mask types mean, which mask type is best? How to mask out face obstructions (like hands, fingers and other things over the face).

A: There are many masking options built into DFL and which is the best and should be used depends on what you are trying to achieve.

First of all to use learned mask you need to enable it during model startup (when starting training of a new model), if you had it disabled during training and try to select learned mask in converter it will look the same as dst mask option.

learned: is a mask that tries to get best results by learning the shape of faces during training.

dst: is a mask derived from the shape of dst faces (based on landmarks embedded into aligned face images during aligning/extraction process).

FAN-prd: is a mask that is based on the predicted face shape (the final one) without taking dst shape into consideration. It's also trained to remove obstructions from face, like hands, fingers, glasses, etc. Can look bad in some cases. Should keep the shape of SRC face more than of DST.

FAN-dst: is a mask that is based on the DST face shape without taking predicted face shape into consideration. It's also trained to remove obstructions from face, like hands, fingers, glasses, etc. Generally provides very stable and consistent mask but will keep the shape of DST face more than of SRC.

FAN-prd*FAN-dst: is a mask that is based on the predicted and dst face shape, trying to average mask between both. It's also trained to remove obstructions from face, like hands, fingers, glasses, etc.

learned*FAN-prd*FAN-dst: as above + what it learned during training, technically it is the most superrior and accurate masking method but slowest to convert and requires having learned mask enabled during training which has small performance penalty.

I personally recommend either using FAN-dst or FAN-prd*FAN-dst or learned*FAN-prd*FAN-dst, don't recommend FAN-prd as it produces weird/bad masks, dst can be used too if you don't have any obstructions over faces in your data_dst and there are no extreme side angles.
You can also try using learned instead of dst (again, only when there are no obstructions of the face, blowjob/stuff in mouth is also an obstruction so if you are doing BJ deepfake, use FAN masks).

TIP: Using learn mask in training will cause it to be a bit slower, you can't disable or enable it after you've already started training but you can hack it by replacing .dat file in your model folder (always do backups in case something breaks).

To disable learn mask, create a new model with learn mask disabled and copy it's .dat file (let it train for few iterations and save it) to replace the old .dat file in the learn mask enabled model and delete SAE_decoder_dstm/srcm.h5 files as they won't be needed (they store the learned mask information).
If you want to enable learn mask in a model that has it disabled, do the same, backup old model, create new with learn mask enabled and replace .dat and also copy over the dstm/srcm decoder files to your model.

UPDATE: Newest version of DFL allows to enable/disable learned mask in SAE and SAEHD models without the need to modify any files.

30. Q: I often see overkill on number of images in celebrity datasets. What are the best tools for automating the process of removing duplicates beyond the built in DFL features to get as many variations of faces without having very large number of images in source/src/celebrity dataset?

A: There are probably some tools that could try to detect similar faces and delete them. I just do this manually.

If it's under 5000 thousand I'd just leave as it is, unless you have 10k-15k of pictures I wouldn't bother about deleting similar duplicate ones, it won't hurt anything and some of those similar frames may just feature some unique detail that may help you to achieve better results.
App suggested by @666VR999 for further dataset cleanup: You are not allowed to view links. Register or Login to view. You are not allowed to view links. Register or Login to view.

31. Q: What to do if you trained with a celebtity faceset and you want to add more faces/images/frames to it? How to add more variety to existing src/source/celebrity dataset?

A: Safest way is to change the name of the entire "data_src" folder to anything else or temporarily moving it somewhere else, then just extract frames from new data_src.mp4 file or if you already have the frames extracted and some pictures ready, create a new folder data_src, copy them inside it and run data_src extraction/aligning process, then just copy aligned images from the old data_src/aligned folder into the new one and upon being asked by windows to replace or skip, select the option to rename files so you keep all of them and not end up replacing old ones with new ones.

32. Q: Does the dst faceset/data_dst.mp4 also need to be sharp and high quality? Can some faces in dst faceset/dataset/data_dst be a bit blurry/have shadows, etc? What to do with blurry faces in my data_dst/aligned folder

A: You want your data_dst to be as sharp and free of any motion blur as possible. Blurry faces in data_dst can cause a couple issues:

- first is that some of the faces in certain frames will not get detected - this will cause original faces to be shown on these frames when converting/merging because they couldn't be properly aligned during extraction.
- second is that others may be incorrectly aligned - this will cause final faces on this frames to be rotated/blurry and just look all wrong and similar to other blurry faces will have to be manually aligned to be used in training and conversion.
- third - even with manual aligning in some cases it may not be possible to correctly detect/align faces which again - will cause original faces to be visible on frames that were to blurry or contained motion blur.
- faces that contain motion blur or are blurry (not sharp) that are correctly aligned may still produce bad results because the models that are used in training cannot understand motion blur, certain parts of the face like mouth when blurred out may appear bigger/wider or just different and the model (H128/SAE, really any training model) will interpret this as a change of the shape/look of that part and thus both the predicted and the final faked face will look unnatural. You should remove those blurry faces from training dataset (data_dst/aligned folder) and put them aside somewhere else and then copy them back into data_dst/aligned folder before converting so that we get the swapped face to show up on frames corresponding to those blurry faces.

You want both your SRC datasets and DST datasets to be as sharp and high quality as possible.
Small amount of blurriness on some frames shouldn't cause many issues. As for shadows, this depends on how much shadow we are talking about, small, light shadows will probably not be visible, you can get good results with shadows on faces but to much will also look bad, you want your faces to be lit as evenly as possible with as little of harsh/sharp and dark shadows as possible.

33. Q: I'm getting error reference_file not found when I try to convert my deepfake back into mp4 with 8) converted to mp4.

[b]A: You are missing data_dst.mp4 file in your workspace folder, check if it wasn't deleted:

[/b]Reason why you need it is that even though you separated it into individual frames with 3.2) extract images from video data_dst FULL FPS all there is inside data_dst folder is frames of the video, you also need sound, which is taken from the original data_dst.mp4 file.

34. Q: I accidentally deleted my data_dst.mp4 file and cannot recover it, can I still convert merged/converted frames into an mp4 video?

A: Yes, in case you've permanently deleted data_dst.mp4 and you have no way of recovering it or rendering identical file you can still convert it back into mp4 (without sound) manually by using ffmpeg and a proper command:

- start by going into folder ...:\_internal\ffmpeg and copy ffmpeg.exe
- paste it into the merged folder
- open up command line by pressing windows key + r (run) and typing cmd or searching it up after pressing windows key and typing cmd/cmd.exe
- copy address of your merged folder (example: D:\DFL\workspace\data_dst\merged)
- in the command line type the letter of your drive, as in example above that would be "d:" (without quotation marks) and press enter
- line D:\> should appear, next type "cd: FULL_ADDRESS", example: "cd: D:\workspace\data_dst\merged"
- you should now see your entire address like this: D:\DFL\workspace\data_dst\merged>
- enter this command:

ffmpeg -r xx -i %d.jpg -vcodec libx264 -crf 20  -pix_fmt yuv420p result.mp4

- xx is framerate
- d is a number representing amount of numbers in the file name so if your merged frames have names like 15024.jpg that would be 5, if it's 5235.jpg it is 4, etc.
If your images are pngs, change .jpg to .png
- crf is quality setting, best to be left at 20.
If your merged file names have some letters in front like out12345.jpg add "out" before the % sign.

Example command for converting frames named "out_2315.png" into an 30 fps .mp4 file named "deepfake".

ffmpeg -r 30 -i out%4.png -vcodec libx264 -crf 20  -pix_fmt yuv420p deepfake.mp4

35. Q: Can you pause conversion and resume it later? Can you save conversion settings? My conversion failed/I got error during conversion and it's stuck at %, can I start it again and convert from last successfully converted/merged frame?


A: Yes, by default interactive converter creates session file in the "model" folder that saves both progress and settings.

If you want to just pause the training you can hit enter (same as starting conversion) and it will pause. If however you need to turn it off completely/restart pc, etc you exit from conversion with esc and wait for it to save your progress, next time you launch conversion, after selecting interactive converter (Y/N) - Y you'll get a prompt asking if you want to use the save/session file and resume the progress, converter will load with the right settings at the right frame.

If your conversion failed and it didn't save the progress you will have to resume it manually, you do it by first backing up your data_dst folder and then deleting all extracted frames inside data_dst as well as all images from aligned and aligned_debug folders inside data_dst (making sure that you backup old files so you can easily bring them back afterwards) that correspond to frames already converted/merged inside folder "merged". Then just start converter, enter settings you used before and convert rest of frames, then combine new merged frames with old ones from the backup data_dst folder and convert to .mp4 as usual.

36. Q: Faces in preview during training look good but after converting them they look bad. I see parts of the original face (chin, eyebrows, double face outline).

A: Faces in the preview are the raw output of the AI, which then needs to be composited over the original footage.
Because of this, when the faces have different shapes or are slightly smaller/bigger, you may see parts of the original face around/outside the mask that the DFL converter creates.
To fix it you need to change conversion settings, start by:

- adjusting the mask type

For types refer to question: Which mask type should I use? Learned? Dst? Fan-dst? Or maybe something else? What do these mask types mean, which mask type is best? How to mask out face obstructions (like hands, fingers and other things over the face).

- adjust mask erosion (size) and blur (feathering, smoothing the edge)

- adjust face size (scale)

NOTE: Negative erosion increases the mask size (covers more), positive decreases it.

37. Q: The final result/deepfake has weird artifacts: the face changes colors, bleeds color from the background, flickers/darkens/changes color in the corners or on the edges when using Seamless mode.

A: You are using seamless mode. DON'T. Instead:

- use overlay or any other mode besides seamless.
- decrease the size of the mask/face by increasing the "Erode Mask" value, so the mask doesn't "touch" areas outside the face/head and pick up the color of the background.
- smooth out the edge of the mask/face by increasing the "Blur Mask" value, which may hide some of the color changes and also helps make the face look more... "seamless" when you decrease the mask size.

I recommend NOT using seamless unless it's absolutely needed, and even then I recommend pausing on every major angle change and camera shift/light change to check that it doesn't cause those artifacts.

38. Q: I'm thinking about using SAEHD, is SAEHD better than SAE? Is SAEHD more demanding than SAE?

A: According to user reports and my tests, yes to both: SAEHD seems like a superior model to SAE because of its reworked features like the new true face mode, random warp for better face generalization, and changes to how pixel loss works.


@Kippax explained how new pixel loss works in SAEHD model:

"I looked at the code and can confirm a "hybrid" pixel loss value is on. SAE uses DSSIM (image structural similarity) when off, and simple pixel difference when it is on.
You are not allowed to view links. Register or Login to view.
To explain what the differences are, for normal SAE, when pixel loss is off it looks at the structural difference between the two images. When pixel loss is on, it looks at how different the images are. The latter allows for more detail, but can be a bit chaotic at the start, which would result in model instability and possible mode collapse (that's why we used to recommend turning it on at the end of the training).

For SAEHD, it uses both. It sums up the value of DSSIM and pixel difference, however it only weights pixel loss 1/5th as it was before. You might have noticed pixel loss values always being smaller than when it is off; this is a good thing. It means early on in the model, it relies on DSSIM more; but as the loss reduces, it incrementally transitions to more and more pixel loss.
You are not allowed to view links. Register or Login to view.
IMO there is no reason to use SAE anymore if you can run SAEHD, even at a much lower batch rate. SAEHD is full of structural improvements and optimizations that make training more efficient. The only reason for training SAE would be legacy / backwards compatibility with old models."
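As a rough illustration of the hybrid loss described above (not the actual DFL code), the combination can be sketched as DSSIM plus pixel difference weighted at 1/5th; the ssim_global helper below is a simplified, window-free SSIM used only to make the example self-contained:

```python
import numpy as np

def ssim_global(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified, window-free SSIM over the whole image (pixel values in [0, 1]).
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def hybrid_loss(pred, target, pixel_weight=0.2):
    # DSSIM (structural difference) plus pixel difference at 1/5th weight,
    # mirroring Kippax's description; not the actual DFL implementation.
    dssim = (1.0 - ssim_global(pred, target)) / 2.0
    pixel = np.mean((pred - target) ** 2)
    return dssim + pixel_weight * pixel
```

Early in training DSSIM dominates the sum; as structural loss shrinks, the pixel term matters relatively more, which matches the gradual transition described above.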

My personal testing of performance with different features (true face, random warp, learned mask) enabled/disabled across different versions of DFL (9.2 SSE, 9.2 AVX, 10.1 AVX) and different model resolutions and batch sizes:

1st set of tests - base settings and setup:
Settings: Batch size 6, model - SAE HD, resolution 160, Optimizer mode 3, Gradient clipping enabled from start.
Hardware: 6GB VRAM GPU (GTX 1060), 16GB RAM, i7 4770k 4.4GHz, DFL 9.2 AVX.

Settings that are changing: random warp, trueface, learned mask.

All tests are performed under identical conditions, no other programs running in background, always starting with new model, same datasets.
DFL version is the newest stable release with the newest commit from Github (e013cb0f6baff3e52c61e2612a95afe69d44a941).

Test 1 - ALL DISABLED: random warp disabled, true face disabled, learned mask disabled: 3 OOMs on startup, then 2900ms.
Test 2 - ALL ENABLED: random warp enabled, true face enabled, learned mask enabled: OOM, fails to start.

Test 3 - RANDOM WARP ONLY: random warp enabled, true face disabled, learned mask disabled: 3 OOMs on startup, then 2900ms
Test 4 - TRUE FACE ONLY: random warp disabled, true face enabled, learned mask disabled: OOM, fails to start.
Test 5 - LEARNED MASK ONLY: random warp disabled, true face disabled, learned mask enabled: 4 OOMs, 4000ms unstable

Test 6 - RANDOM WARP + TRUEFACE: random warp enabled, true face enabled, learned mask disabled: OOM, fails to start
Test 7 - RANDOM WARP + LEARNED MASK: random warp enabled, true face disabled, learned mask enabled: OOM, fails to start

I didn't and won't test true face together with learned mask because both have such a big performance hit that the model would probably have to be trained at 128 res with an even lower batch size, which would probably be a waste of time since all the improvements would be hard to notice at such a low batch size and resolution.

I didn't really have to run all the variations since it's clear you can't run true face at these settings, and learned mask can be run but is unstable, so it's the same as if it didn't work at all.
Learned mask must have high VRAM usage and slows down training considerably, but not enough to completely overload the GPU, so training still starts; true face must have even higher VRAM usage, so it doesn't start at all.
Random warp seems to have almost no performance penalty; at least I cannot see much in terms of VRAM usage, maybe 1-2% slower training (within margin of error), which is good news.

Running all those features sadly is not possible on 6GB GPU, at least not at this resolution and batch size of 6.
However, learned mask isn't that important and can always be enabled for a while, same as the other features, so it's possible to use them all, just not at the same time during training, and they should work at a lower batch size and/or resolution. Speaking of lower res:

Res 144, batch size 6, default dims, gradient clipping enabled, full face, DF SAEHD, OP mode 3:

Test 8 - RANDOM WARP + TRUE FACE: random warp enabled, true face enabled, learned mask disabled: no OOMs, 2800ms
Test 9 - RANDOM WARP ONLY: random warp enabled, true face disabled, learned mask disabled: no OOMs, 2450ms
Test 10 - TRUE FACE ONLY: random warp disabled, true face enabled, learned mask disabled: no OOMs, 2750ms
Test 11 - ALL DISABLED: random warp disabled, true face disabled, learned mask disabled: no OOMs, 2450ms (same as with random warp enabled)
Test 12 - ALL DISABLED + higher batch size - 8: OOM, fails to start

It looks like random warp adds at most 50ms per iteration (within margin of error); everything disabled performs the same as random warp only. There may be a small VRAM hit which, with true face enabled, affects training speed a bit.

It looks like max batch size with 6GB GPU is 6 for 144 with random warp + true face and for 160 with only random warp.

3rd set of tests was to check if SAEHD (despite supposedly being more demanding) runs better than old SAE because of lower default dims:
Settings: Batch size 12, RES 128, SAEHD DF, default dims (512, 21), OP mode 3, gradient clipping enabled, all other options (true face, random warp, learned mask, style power, face flip, etc) disabled:

Note: both CA weights and pixel loss settings are not available in SAEHD, I'm assuming CA weights are enabled by default because they always load when starting new SAEHD model and pixel loss is either removed or something was changed about it.

Test 13 - Batch size 12, SAEHD: OOMs, fails to start - seems like it is indeed more demanding as those settings run without issues on regular SAE model.
Test 14 - Batch size 10, SAEHD: no OOMs, 2550ms but unstable and crashes.
Test 15 - Batch size 12: SAE (same settings, same datasets, CA weights enabled, pixel loss disabled) DFL 9.2 AVX: no OOMs, 2100ms.
Test 16 - Batch size 12: SAE, older DFL version with updated SAE model, 9.2 SSE: no OOMs, 2100ms.

This confirms that there is virtually no difference in performance between the AVX and SSE builds of DFL 9.2 for regular, non-RTX GPUs (without tensor cores), and that SAEHD is indeed more demanding than regular SAE even at the same settings, despite having lower dims values; but then there are changes to CA weights and other aspects (no access to CA weights or pixel loss settings).

Performance would probably be slower on 10.1 AVX but we can't tell without testing so... here it goes:

Test 17 - Batch size 12: SAE, 10.1 AVX: no OOMs, 2100ms, same as 9.2 SSE and 9.2 AVX.
Test 18 - Batch size 6, SAE HD, DF, RES 160, rest as usual, no additional options enabled (random warp, true face, learned mask) same as Test 1: 3 OOMs on start, then 2900 ms (same as 9.2 AVX)
Test 19 - Batch size 6, SAE HD, DF, RES 144, rest as usual, random warp and true face enabled (learned mask disabled) same as Test 8: no OOMs, 2800ms (same as 9.2 AVX).

And it looks like performance is roughly the same between all versions, assuming your CPU supports AVX (i7 4770k in my case).

39. Q: What's the difference between half face, mid (medium) face and full face face_type modes?


A: You are not allowed to view links. Register or Login to view.

Full face is the recommended face_type mode to get as much coverage of the face as possible; it is also the default mode for the SAE/SAEHD models.
Half face was the default face_type mode in the H64 and H128 models, and it's also possible to train SAE/SAEHD models as half face.

Mid (medium) face is a new mode that covers around 30% larger area than half face.

In general half face models are harder to mask properly and can cut off parts of the face (like eyebrows, bottom of the mouth, chin) but provide better detail, because more of the model's resolution is used to resolve details on the face (the face fills more of the frame, as you can see in the picture above).
Mid face mode is supposed to be a compromise between coverage and detail.


40. Q: What is the best GPU for deepfakes? I want to upgrade my gpu, which one should I get?

A: The answer to this will change as deepfaking software gets further developed and GPUs become more powerful, but for now the best GPU is the one with the most VRAM that is also generally fast.

For performance figures check our SAE spreadsheet: You are not allowed to view links. Register or Login to view.

Right now NVIDIA is faster than AMD.
For start: GPUs with 6GBs of memory: GTX 1060 6GB (resolution up to 128-144)
For higher model resolutions: GPUs with 8GBs of memory: GTX 1070/1070Ti/1080, RTX 2060 Super/2070/2070 Super/2080 Super (resolution up to 192)
For even higher model resolutions/faster training: GPUs with 11GBs of memory or more: GTX 1080Ti 11GB, RTX 2080Ti 11GB, Titan RTX 24GB, QUADRO RTX 5000 16GB, QUADRO RTX 6000 24GB (resolution up to 256, max for SAE)

Bear in mind that the resolution you can run depends on the batch size and the type of model/software you are running, so the figures above are not at all exact: you may be able to run a 256 model on a 1080 with 8GB, but only with SAE, without any additional features, at a low batch size, and slowly at that. The same goes the other way: you may only be able to run 192 on a Titan RTX if you enable all SAEHD features and run at a high batch size.

41. Q: What do the ae_dims, e_ch_dims and d_ch_dims mean? Can I decrease them to get more performance/lower VRAM usage? Can I increase them to get better quality? How to change them? By what factor/how much can I change them?


A: ae_dims and e/d_ch_dims are settings that control the autoencoder/encoder/decoder dimensions.

They can be changed to either increase performance or improve quality, but they should be left at their default values unless you really need to change them.

The following information is not 100% accurate; it is taken from a DFL manual written by its developer @iperov, who is Russian, so it's translated with Google Translate and interpreted by me - a person who knows almost nothing about neural networks. Some of the terms are most likely not right but it should still give you a rough idea what each option does, so buckle up and get ready for some technical fuckery:

a) AE_DIMS controls the autoencoder dimension size - it affects the network's capability to learn/recognize all features of the face, its expressions, etc.
Too low a value could cause certain features to not be learned properly (like open/closed eyes, eyes looking in different directions, open/closed mouth, teeth, etc).
Default settings for SAE DF - 512, SAEHD DF - 256, SAE LIAE - 256, SAEHD LIAE - 256.

b) E_CH_DIMS controls encoder dimensions - it affects the capability of the network to learn/recognize large patterns (face structure).
Default settings for SAE DF - 42, SAEHD DF - 21, SAE LIAE - 42, SAEHD LIAE - 21.

c) D_CH_DIMS controls decoder dimensions - it affects the capability of the network to learn/recognize small patterns (facial details).
Default settings for SAE DF - 21, SAEHD DF - 21, SAE LIAE - 21, SAEHD LIAE - 21.

You don't technically have to change them by any factor like 16, but I'd recommend changing the ae_dims value by that factor and not going below 21 with e/d_ch_dims. As for raising them, I'm not sure there is a point, but you can; it will increase VRAM usage and make training run much slower.

SAEHD merges e_ch_dims and d_ch_dims into one setting, which is the same for both, as you can see in the defaults above.

TIPS:

1. You may start training with a smaller batch size and increase it after some time; it will help your model generalize faster (learn the basic shapes, etc.) before you raise the batch size.

Following part is a citation from GAN-er, from the DFL manual:

"The reason is that a large batch size will give you a more accurate descent direction but it will also be costlier to calculate, and when you just start, you care mostly about the general direction; no need to sacrifice speed for precision at that point. There are plenty of sources discussing the batch size, as an example you can check this one:
https://stats.stackexchange.com/questions/164876/tradeoff-batch-size-vs-number-of-iterations-to-train-a-neural-network ".

2. Another way to help with model generalization and speed up the initial part of training is to enable random warp of samples (random_warp) and disable it once the model has learned the face; disabling it will make all the detail appear faster. It's enabled by default in SAE but togglable in SAEHD. If you want to be able to disable it in regular SAE and don't want to train a new SAEHD model, you can use @fapper93's modification, which adds this and a bunch of other SAEHD features like True Face to SAE: https://mrdeepfakes.com/forums/thread-sae-modified-sae-w-trueface-random-warp

3. Another tip from GAN-er: if your loss values are stuck for a while or you are getting artifacts in previews, you may try swapping the src and dst datasets for a bit and then reverting them (basically rename data_src to data_dst and vice versa).

4. Another thing you could try to fix stuck losses/artifacts on some faces is to enable random_flip for a bit; it will flip the faces horizontally (mirror flip) and possibly give the AI some more data to work on. Disable it afterwards and keep training with it disabled, since faces aren't symmetrical and having it enabled for too long may lead to an incorrect-looking face swap.

5. The most important tip... Repeated by many but still often ignored: it doesn't matter which model you use, if you have bad src and dst datasets your deepfake will look bad; as GAN-er describes it, "garbage in, garbage out". Keep only sharp, high-resolution, well-lit, properly aligned images free of motion blur, noise and compression artifacts in your SRC, use a high quality and properly extracted/aligned DST, and you will get good results even with H64 (but actually don't use it, stick to SAE/SAEHD and maybe H128 if you really have to).
There is a reason why the very first Gal Gadot deepfake that started it all still looks better than some attempts by various users done with SAE.

More tips from manual:

6. Ensure that you don't have more SRC images than DST or your model may not train/converge properly; enable "Feed faces to network sorted by yaw" in that case. This setting can be toggled on/off during training.

7. If the SRC face is wider than the DST face it may not train/converge properly; if the difference is big enough and you have no other option, use "Src face scale modifier %" with a negative value (-5 for example).

8. The DF architecture (SAE/SAEHD models) gives a more direct swap, so the final face should look more like SRC; LIAE uses morphing, so it may look less like SRC but is able to slightly adapt to a differently shaped DST face (though not by much). For LIAE it's also recommended to use a DST that looks more like SRC.

9. Higher batch size - better face generalization, higher ae/e/d_ch dimensions - better face quality (large and small details)

10. Face style power - trains/transfers the style of the face: complexion, lighting, make-up. High values or prolonged training with high values may lead to artifacts; it's recommended to stick to a value of 0.1.
Background style power - trains/transfers the style of the face contour/edge and the background (useful if you are planning on increasing the mask size to show more area around the face). High values or prolonged training with high values may lead to artifacts; it's recommended to stick to a value of 0.1.

"How best to train SAE with style? There is no better solution, it all depends on the stage. Experiment style with values ranging from 10.0, and then reducing the value to 0.1-2.0 after 15-25k iterations. Enable write preview history and track changes. Make a backup file every model 10k iterations. You can roll back the model files and to change the values, if something went wrong in the preview stories."

11. You can use a PowerShell command to quickly batch rename files that have a _0 (or other) suffix; this way you can, for example:

- sort your data_dst by any method (histogram, blur, yaw, etc) to find bad frames
- then copy them to a new folder
- rename original "aligned" to something else (like "aligned_1") so you can rename the new folder with bad faces to "aligned"
- then use 5.3.other) data_dst util recover original filename,
- after it finishes go to the "aligned" folder, where you will have all the bad faces you found, with their original names and a suffix like _0 / _1
- hold ctrl+shift while right clicking, open powershell and use this command:
get-childitem *.jpg | foreach {rename-item $_ $_.name.replace("_0","")}

- if you have more files with different suffixes, just run the command again, changing _0 to whatever other suffix you may have, like _1:
get-childitem *.jpg | foreach {rename-item $_ $_.name.replace("_1","")}

- this way you can just copy those bad aligned frames into "aligned_debug", then click replace and delete them while they are highlighted (useful if you happen to have a lot of bad alignments)
- at the end delete the bad frames folder "aligned", rename "aligned_1" back to the original name, run 5.3.other) data_dst util recover original filename to restore the original order, and then run 5) data_dst extract faces MANUAL RE-EXTRACT DELETED RESULTS DEBUG to re-extract the faces you've just deleted from "aligned_debug".
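If you prefer Python over PowerShell, here is a hypothetical one-pass equivalent that strips any _0/_1/... suffix from .jpg names in one go (run it against the "aligned" folder; note it will overwrite on name collisions, same as the PowerShell version):

```python
import os
import re

def strip_suffixes(folder="."):
    """Rename e.g. 12345_0.jpg -> 12345.jpg for every jpg in the folder,
    whatever the _N suffix is, in a single pass."""
    renamed = 0
    for name in os.listdir(folder):
        # only touch a trailing _<digits> right before the .jpg extension
        new = re.sub(r"_\d+(?=\.jpg$)", "", name)
        if new != name:
            os.rename(os.path.join(folder, name), os.path.join(folder, new))
            renamed += 1
    return renamed
```

Unlike the replace("_0","") approach, the regex only matches the suffix at the end of the name, so a _0 appearing elsewhere in a filename is left alone.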

You are not allowed to view links. Register or Login to view.
Raising money for a new GPU, if you enjoy my fakes or my work on forums consider donating via bitcoin, tokens or paypal/patreon, any amount helps!
Paypal/Patreon: You are not allowed to view links. Register or Login to view.
Bitcoin: 1C3dq9zF2DhXKeu969EYmP9UTvHobKKNKF
Want to request a paid deepfake or have any questions regarding the forums or deepfake creation using DeepFaceLab? Write me a message.
TMB-DF on the main website - You are not allowed to view links. Register or Login to view.
#2
42. Q: Can I change the face type/size after I've already started training (e.g. from half face to mid-half face or full face)?


A: No, you can't. Only way of changing it is to start training a new model.

We recommend just downloading one of the shared pretrained models, it will save you a lot of time; you can find them here: You are not allowed to view links. Register or Login to view.
