tutsmybarreh - [GUIDE] DeepFaceLab 2.0 EXPLAINED AND TUTORIALS (recommended)
#1
DeepFaceLab 2.0 Guide/Tutorial

[Image: eWsS3rBh.jpg]

NOTE: This thread is meant to be just a guide.

The only types of posts/questions allowed here are ones about the guide itself - suggestions on what to add to it, etc.
If you have a question about techniques/workflows, suggestions about features in DFL 2.0, want to complain, report bugs, or just want to talk in general and share something about the process of making deepfakes in DFL 2.0, please post in this thread:

You are not allowed to view links. Register or Login to view.

What is DeepFaceLab 2.0?

DeepFaceLab 2.0 is a tool/app utilizing machine learning to swap faces in videos.

What's the difference between 1.0 and 2.0? What's new in DFL 2.0?

At the core DFL 2.0 is very similar to 1.0, but it was rewritten and optimized to run much faster and offer better quality at the cost of compatibility.
Because of this, AMD cards are no longer supported and new models (based on SAE/SAEHD and Quick96) are incompatible with previous versions. However, any datasets that have been extracted with later versions of DFL 1.0 can still be used in 2.0.

SAEHD DFL 2.0 Spreadsheet with users model settings: You are not allowed to view links. Register or Login to view.
DFL 2.0  pretrained models: You are not allowed to view links. Register or Login to view.


Here is a list of the main features and changes in 2.0:
  • Available as a standalone app with zero dependencies for all Windows versions.
  • Includes 2 models: SAEHD (4 architectures) and Quick 96.
  • Support for multi-GPU training.
  • Increased performance during faceset (dataset) extraction, training and merging thanks to better optimization (compared to DFL 1.0)
  • Faceset enhancer tool - for upscaling/enhancing detail of source faceset (dataset).
  • New GAN Power option - Generative Adversarial Network training, which enhances details of the face.
  • New TrueFace Power option - variable face discrimination for better likeness to the source.
  • Ability to choose which GPU to use for each step (extraction, training, merging).
  • Ability to quickly rename, delete and create new models within the command line window.
  • The merging process now also outputs mask files for post-processing work in external video editing software, with an option to render them out as a black and white video.
  • Face landmark/position data embedded within dataset/faceset image files with option to extract embedded info for dataset modifications.
  • Training preview window.
  • Interactive converter.
  • Debug (face landmark preview) option for source and destination (data_src/dst) datasets.
  • Faceset (dataset) extraction with S3FD and/or manual extraction.
  • Training at any resolution in increments of 16, at resolutions up to 512x512.
DeepFaceLab 2.0 is compatible with NVIDIA GPUs and CPUs; there is no AMD support anymore. If you want to train on AMD GPUs, DFL 1.0 can do it, but it's no longer supported/updated.
DFL 2.0 requires an NVIDIA GPU that supports at least CUDA Compute Capability 3.0.
CUDA Compute Capability list: You are not allowed to view links. Register or Login to view.

DOWNLOAD:

The GitHub page of DFL 2.0 can be found here (contains newest version as well as all current updates): You are not allowed to view links. Register or Login to view.
Stable releases can be found here:
You are not allowed to view links. Register or Login to view.
You are not allowed to view links. Register or Login to view.

NOTE FOR GOOGLE DRIVE: If you get a message that the download quota has been exceeded, right click -> Add to My Drive, then in your drive make a copy of it (right click -> Make a copy) and download that new copy.

If you don't have an NVIDIA GPU, your CPU doesn't let you train in any reasonable time, or you don't want to use DFL 1.0 with your AMD GPU, you may consider trying out Google's cloud computing service Google Colab and our DFL implementation for it: You are not allowed to view links. Register or Login to view.

------------------------------------------------------------------------------------------------------------------------------------------------------

Explanation of all DFL functions:

DeepFaceLab 2.0 consists of a selection of .bat files used to extract, train and merge (previously called convert), which are the 3 main steps of creating a deepfake. They are located in the main folder along with two subfolders:
  • _internal (that's where all the files necessary for DFL to work are)
  • workspace (this is where your models, videos, facesets (datasets) and final video outputs are)
------------------------------------------------------------------------------------------------------------------------------------------------------

Before we go into the main guide part, here is some terminology (folder names are written with "quotations"):

Faceset (dataset) - is a set of images that have been extracted (or aligned with landmarks) from frames (extracted from video) or photos.

There are two datasets being used in DFL 2.0 and they are data_dst and data_src:

- data_dst is a dataset containing aligned face images (512x512) that are extracted from the target (destination) video data_dst.mp4. They contain information about the shape of the faces, their features (eyes, mouth, nose, eyebrows) and their position in the image. During the extraction/aligning process, 2 folders are created within the "data_dst" folder:

"aligned", containing images of faces, 512x512 in size (with the alignment data embedded)

"aligned_debug", which contains the original frames with landmarks overlaid on the faces and is used to identify correctly/incorrectly aligned faces (it doesn't take part in the training or merging process).
After cleaning up the dataset (removing false positives and incorrectly aligned faces and fixing them) it can be deleted to save space.

- data_src is a dataset containing images of faces that are extracted either from the data_src.mp4 file (which can be an interview, movie, trailer, etc.) or from images of your source - basically these are the extracted faces of the person we want to put onto the body/head of the other person (onto the target/destination video).
By default, upon extraction this folder only contains the "aligned" folder, but an "aligned_debug" folder can also be generated (you get to choose during extraction).

Before you get to extract faces however you must have something to extract them from:

- for data_dst you should prepare the target (destination) video and name it data_dst.mp4
- for data_src you should either prepare the source video (as in the examples above) and name it data_src.mp4, or prepare images in jpg or png format.
The process of extracting frames from video is also called extraction, so for the rest of the guide/tutorial I'll be referring to the two processes as "face extraction/alignment" and "frame extraction" respectively.

As mentioned at the beginning, all of that data is stored in the "workspace" folder: that's where both data_src/dst.mp4 files and both "data_src/dst" folders are (with extracted frames and the "aligned"/"aligned_debug" folders for extracted/aligned faces), along with the "model" folder where model files are stored.
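
For orientation, a typical "workspace" folder layout after frame and face extraction looks roughly like this:

Code:
workspace/
    data_src.mp4                (source video)
    data_dst.mp4                (target/destination video)
    data_src/                   (frames extracted from data_src.mp4)
        aligned/                (extracted/aligned source faces)
        aligned_debug/          (optional - frames with landmarks drawn on them)
    data_dst/                   (frames extracted from data_dst.mp4)
        aligned/                (extracted/aligned destination faces)
        aligned_debug/          (frames with landmarks drawn on them)
    model/                      (model files)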

------------------------------------------------------------------------------------------------------------------------------------------------------

Options are grouped based on the function they perform.

1. Workspace cleanup/deletion:

1) Clear Workspace - self-explanatory; it deletes all data from the "workspace" folder. Feel free to delete this .bat file to prevent accidental removal of important files you will be storing in the "workspace" folder.

2. Frames extraction from source video (data_src.mp4):

2) Extract images from video data_src - extracts frames from the data_src.mp4 video file and puts them into the automatically created "data_src" folder. Available options:
- FPS - skip for the video's default frame rate, or enter a numerical value for a different frame rate (for example, entering 5 will treat the video as if it were 5 frames per second, meaning fewer frames will be extracted)
- JPG/PNG - choose the format of the extracted frames; jpgs are smaller and generally have good enough quality, so they are recommended; pngs are larger and don't offer significantly higher quality, but they are an option.
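
For reference, this step is essentially an FFmpeg call; a minimal standalone sketch of the same idea, assuming ffmpeg is installed and on your PATH (the 5 fps value and the frame naming pattern are illustrative, not necessarily exactly what DFL itself uses):

Code:
import os
import subprocess

# Extract frames from data_src.mp4 into the data_src folder as JPGs.
os.makedirs("workspace/data_src", exist_ok=True)   # ffmpeg won't create the folder itself
subprocess.run([
    "ffmpeg",
    "-i", "workspace/data_src.mp4",
    "-r", "5",                        # illustrative custom frame rate; drop this for the video's native FPS
    "workspace/data_src/%05d.jpg",    # use .png instead for the PNG option
], check=True)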

3. Video cutting (optional):

3) cut video (drop video on me) - allows you to quickly cut any video to the desired length by dropping it onto that .bat file. Useful if you don't have video editing software and want to quickly trim a video. Options:
From time - start of the cut
End time - end of the cut
Audio track - leave at default
Bitrate - lets you change the bitrate (quality) of the video - also best left at default
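
This cut script is similarly a thin FFmpeg wrapper; a rough sketch of the same idea, with placeholder times and file names:

Code:
import subprocess

# Cut a clip down to the 00:01:00 - 00:02:30 range (times and names are placeholders).
subprocess.run([
    "ffmpeg",
    "-i", "some_video.mp4",
    "-ss", "00:01:00",    # "From time"
    "-to", "00:02:30",    # "End time"
    "some_video_cut.mp4",
], check=True)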

3. Frames extraction from destination video (data_dst.mp4):

3) extract images from video data_dst FULL FPS - extracts frames from the data_dst.mp4 video file and puts them into the automatically created "data_dst" folder. Available options:
- JPG/PNG - same as in 2).

4. Data_src faces extraction/alignment:

The first stage of preparing the source dataset is to detect and align the face landmarks and produce 512x512 face images from the extracted frames located inside the "data_src" folder.

There are 4 methods:
For Full Face, Mid-Half Face and Half Face training:
4) data_src extract full_face S3FD - automated extraction using S3FD.
4) data_src extract full_face MANUAL - manual extraction.

For Whole Face training (also works with full face, mid-half face and half face training):
4) data_src extract whole_face S3FD - automated extraction using S3FD.
4) data_src extract whole_face MANUAL - manual extraction.

Available options are:
- choosing which gpu (or cpu) to use for faces extraction/alignment process.
- choosing whether to generate "aligned_debug" folder or not.

4. Data_src cleanup:

After that is finished, the next step is to clean the source faceset/dataset of false positives and incorrectly aligned faces; for detailed info check this thread: You are not allowed to view links. Register or Login to view.

4.1) data_src view aligned result - opens an external app that lets you quickly go through the contents of the "data_src/aligned" folder to find false positives, incorrectly aligned source faces and faces of other people, so you can delete them.

4.2) data_src sort - contains various sorting algorithms to help you find unwanted faces, these are the available options:

[0] blur
[1] face yaw direction
[2] face pitch direction
[3] face rect size in source image
[4] histogram similarity
[5] histogram dissimilarity
[6] brightness
[7] hue
[8] amount of black pixels
[9] original filename
[10] one face in image
[11] absolute pixel difference
[12] best faces
[13] best faces faster

4.2) data_src util add landmarks debug images - lets you generate the "aligned_debug" folder after extracting faces (if you wanted to have it but forgot or didn't select the right option in the first place).

4.2) data_src util faceset enhance - uses a special machine learning algorithm to upscale/enhance the look of faces in your dataset; useful if your dataset is a bit blurry or you want a sharp one to have even more detail/texture.

4.2) data_src util faceset metadata save and 4.2) data_src util faceset metadata restore - let you save and later restore the embedded alignment data of your source faceset/dataset, so you can edit the face images after extraction (for example sharpen them, edit out glasses, skin blemishes, color correct) without losing the alignment data and without having to re-extract them.

EDITING ANY IMAGES FROM "ALIGNED" FOLDER WITHOUT THIS STEP WILL REMOVE THAT ALIGNMENT DATA AND THOSE PICTURES WON'T BE USABLE IN TRAINING, WHEN EDITING KEEP THE NAMES THE SAME, NO FLIPPING/ROTATION IS ALLOWED, ONLY SIMPLE EDITS LIKE COLOR CORRECTION, ETC.
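
To illustrate the kind of "simple edit" that is allowed (run metadata save first, do the edit, then run metadata restore), here is a minimal sketch that applies a mild color/contrast tweak to every face image while keeping file names, dimensions and orientation unchanged; the adjustment values and the folder path are placeholders, and Pillow is assumed to be installed:

Code:
from pathlib import Path
from PIL import Image, ImageEnhance

aligned = Path("workspace/data_src/aligned")   # placeholder path

for jpg in aligned.glob("*.jpg"):
    img = Image.open(jpg)
    img = ImageEnhance.Color(img).enhance(1.05)      # slight saturation boost (placeholder value)
    img = ImageEnhance.Contrast(img).enhance(1.03)   # slight contrast boost (placeholder value)
    # Same file name, same dimensions, no flip/rotation - only this kind of edit is safe.
    img.save(jpg, quality=95)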

4.2) data_src util faceset pack and 4.2) data_src util faceset unpack - packs/unpacks all faces from "aligned" folder into/from one file.

4.2.other) data_src util recover original filename - reverts names of face images back to original order/filename (after sorting).

5. Data_dst preparation:

Here the steps are pretty much the same as with the source dataset, with a few exceptions. Let's start with the faces extraction/alignment process.
We still have the Manual and S3FD extraction methods, but there is also one that combines both, plus a special manual extraction mode; the "aligned_debug" folder is always generated.

For Full Face, Mid-Half Face and Half Face training:
5) data_dst extract full_face MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG - this one is used to manually align/extract faces for frames that were deleted from "aligned_debug" folder.
More on that in next step - Data_dst cleanup.
5) data_dst extract full_face MANUAL - manual extraction.
5) data_dst extract full_face S3FD + manual fix - automated extraction + manual one for frames where algorithm couldn't properly detect faces.
5) data_dst extract full_face S3FD - automated extraction using S3FD algorithm.

For Whole Face training (also works with full face, mid-half face and half face training):
5) data_dst extract whole_face MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG
5) data_dst extract whole_face MANUAL
5) data_dst extract whole_face S3FD + manual fix
5) data_dst extract whole_face S3FD

Available options are:
- choosing which gpu (or cpu) to use for faces extraction/alignment process.

5. Data_dst cleanup:

After we've aligned the data_dst faces we have to clean them up. As with the source faceset/dataset, we have a selection of sorting methods, which I'm not going to explain here as they work exactly the same as the ones for src.
However, cleaning up the destination dataset is different from the source, because we want to have faces aligned for all the frames where they are present. There are a couple of tools at our disposal for that:

5.1) data_dst view aligned results - lets you view the contents of the "aligned" folder using an external app (built into DFL) which offers quicker thumbnail generation than the default Windows Explorer.
5.1) data_dst view aligned_debug results - lets you quickly browse the contents of the "aligned_debug" folder to locate and delete any frames where our target person's face has incorrectly aligned landmarks or where landmarks weren't placed at all (which means the face wasn't detected). In general you use this to check whether all your faces are properly extracted and aligned - if the landmarks on some frames aren't lining up with the shape of the face or the eyes/nose/mouth/eyebrows, or are missing entirely, those frames should be deleted so we can later manually re-extract/align them.
5.2) data_dst sort - same as with the source faceset/dataset, this tool lets you sort all aligned faces within the "data_dst/aligned" folder so that it's easier to locate incorrectly aligned faces, false positives and faces of other people we don't want to train our model on/swap faces onto.
5.2) data_dst util faceset pack and 5.2) data_dst util faceset unpack - same as with source, lets you quickly pack the entire dataset into one file.
5.2) data_dst util recover original filename - same as with source, restores the original names/order of all aligned faces after sorting.
5.3) data_dst mask editor - allows you to manually edit the masks of the data_dst aligned faces (so you can exclude parts of the face from showing up after merging/converting - where the mask isn't present on the face, parts of the original face/frame will be visible) - optional feature.

Additionally, the mask editor has an option called Default eyebrows expand modifier - it lets you expand the mask above the eyebrows automatically, without the need to manually edit the mask for each face, but it can cause issues on side profiles where the expansion will also cover background (recommended only for frontal angles and moderate side angles).

Example:

[Image: Tmy5tACh.jpg]

Results of edited mask training + merging (conversion with dst mask):

[Image: wNewLwjh.jpg]

Manually editing masks is a very tedious and time-consuming process; if you want to get rid of obstructions in the deepfake, you're more likely to just use FANseg mask modes during the merging/conversion process instead.

In the converter (or interactive converter, which we recommend) you can select various mask modes (like fan-prd, fan-dst, fan-prd * fan-dst, learned * fan-prd * fan-dst) which can be used to automatically mask out obstructions from faces (like glasses or hands that are covering/obstructing the data_dst faces).

Here is an example of FANseg mode masking the hand:

[Image: M1gQaZfh.jpg]

For the whole face models either manual masking or using Xseg is required.
Guide on manual masking: You are not allowed to view links. Register or Login to view.
Guide on training Xseg model: You are not allowed to view links. Register or Login to view.

Back to the cleanup. Now that you know your tools, here are a few different example workflows for cleaning up the data_dst dataset.

1. I use this first method when I'm working with a long clip where the detection rate is high, I'm okay with missing a few faces, and I just want to fix those that were incorrectly aligned:

You start by sorting the faces using 5.2) data_dst sort and selecting sorting by histogram. This will generally sort faces by their similarity in color/structure, so it's likely to group similar ones together and separate any images that contain rotated or zoomed in/out faces, as well as false positives and faces of other people, and put them at either the beginning or the end of the list.

You should first delete all false positives and unwanted faces. Once that's done, cut out all incorrectly aligned faces and place them into a separate folder. The reason for this is that we next need to use 5.1) data_dst view aligned_debug results to find all frames where the landmarks are missing or incorrectly placed on the faces of our target person; by setting those incorrectly aligned faces aside, we can do a few things with them (detailed in the quoted tip below) that will let us copy them into the "aligned_debug" folder, replace the corresponding frames, and, while they are still highlighted in Windows Explorer, hit delete to remove them. There will still be some that we have to locate manually (such as faces that weren't detected at all), but doing it this way can save you a lot of time, especially if there are a lot of incorrectly aligned faces in a long clip, where they will usually sit somewhere in the middle among correctly aligned ones and can be hard to notice. Here is a more detailed version of this method

(TIP #11 from FAQ): You are not allowed to view links. Register or Login to view.

Quote:- sort your data_dst by any method (histogram, blur, yaw, etc) to find bad frames
- then copy them to a "new folder"
- rename original "aligned" to something else (like "aligned_1") so you can rename the "new folder" with bad faces to "aligned"
- then use 5.3.other) data_dst util recover original filename,
- after it finishes go to the "aligned" folder where you will have all the bad faces you found with original name and some prefix like _0 / _1
- hold shift while right clicking, open powershell and use this command:
get-childitem *.jpg | foreach {rename-item $_ $_.name.replace("_0","")}

- if you have more files with different prefixes, just run the command again by changing _0 to any other prefix you may have like _1:
get-childitem *.jpg | foreach {rename-item $_ $_.name.replace("_1","")}

- this way you can just copy those bad aligned frames into "aligned_debug", then you just click replace and then delete them while they are highlighted (useful if you happen to have lots of bad alignments)
- at the end delete the bad frames folder "aligned" and rename "aligned_1" back to the original name.


No matter if you used my technique or found them all manually, you should now run 5.2) data_dst util recover original filename to recover the original names/order of the face images, and then run 5) data_dst extract faces MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG to extract the faces you've just deleted from "aligned_debug". After that's done you have your data_dst dataset cleaned up, with all faces correctly extracted (including partially visible ones) and ready to train.

2. If however you have a long clip with a lot of incorrectly aligned faces, or the clip is shorter, the following method is much faster.
You start by removing all bad faces from your "data_dst/aligned" folder; you can make finding them faster by first sorting by histogram similarity. After you've removed all bad faces (no matter if it was a false positive, an incorrect alignment or another person - delete everything that doesn't look right), revert the remaining faces back to their original filenames. In the meantime, create a copy of the "aligned_debug" folder. While that's copying, go back to the "aligned" folder and use the PowerShell command from method #1 to remove the _0 prefixes from your dataset. If faces with a _1 prefix are present (due to a transition where both faces are visible on one frame), manually check whether both were correctly aligned; if not, check which one wasn't, rename the good one to _1 and remove the bad one. After that, run the command, wait for it to rename all files to have no prefix (with the exception of the good faces with the _1 prefix), and then copy them all over to the "aligned_debug - copy" folder, replace, then delete. You will be left with all the frames from which faces weren't extracted or were extracted incorrectly. Now you can go through this folder much more quickly: delete all frames where no faces you want to extract are visible, and at the end you will be left with only the frames from which you want to extract new faces using MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG. Copy them to the original "aligned_debug" folder, replace and delete. While doing the copy/replace/delete action, make sure not to click anywhere in the folder - if you do, the newly replaced files won't be highlighted anymore and you won't be able to quickly delete them, which is the point of all of this.

More detailed info is in the FAQ (which you should definitely read, has tons of common questions, bug fixes, tips, etc):
You are not allowed to view links. Register or Login to view.

And in this thread there are some details on how to create source datasets: what to keep, what to delete, in general how to clean a source dataset (pretty much the same way as the target/destination dataset), and how/where to share them with other users: You are not allowed to view links. Register or Login to view.

6. Training:

There are currently 2 models to choose from for training:

SAEHD (6GB+): High Definition Styled AutoEncoder model - high end model for high end GPUs with at least 6GB of VRAM.

Features:
- runs at any resolution in increments of 16 up to 512x512 pixels
- half face, mid-half face, full face and whole face mode
- 4 architectures: DF, LIAE, DFHD, LIAEHD
- Adjustable batch size
- Adjustable model auto encoder, encoder, decoder and mask decoder dimensions
- Adjustable auto backup
- Preview history
- Target iteration
- Random face yaw flip setting
- Mask learning
- GPU Optimizer
- Learning dropout
- Random warp
- Adjustable GAN training power setting
- Adjustable True Face training power setting
- Adjustable Face and Background Style power setting
- Color transfer
- Gradient Clipping
- Pretrain mode

Quick96 (2-4GB): Simple model derived from SAE model - dedicated for low end GPUs with 2-4GB of VRAM.

Features:
- runs at 96x96 pixels resolutions
- full face mode
- batch size 4

Both models can generate good deepfakes, but obviously SAEHD is the preferred and more powerful one.
If you want to quickly test out your ideas, Quick96 isn't a bad choice, but of course you can still run SAEHD at similar settings or go even lower.
If you want to see what other people can achieve with various graphics cards, check this spreadsheet out where users can share their model settings:
You are not allowed to view links. Register or Login to view.

After you've checked other people's settings and decided whether you prefer fast training or want to wait and run a heavier model, you start it up using one of these:

6) train SAEHD
6) train Quick96

Since Quick96 is not adjustable, you will see the command window pop up and ask only 1 question - CPU or GPU (if you have more than one GPU it will let you choose one of them or train with both).

SAEHD however will present you with more options to adjust. You already know what the features are; here is a more detailed explanation of them, in the order they are presented to the user when starting training:

Autobackup every N hour ( 0..24 ?:help ) : self-explanatory - lets you enable automatic backups of your model every N hours. Leaving it at 0 (default) disables auto backups. Default value is 0 (disabled).

Target iteration : will stop training after a certain number of iterations is reached; for example, if you want to train your model to only 100,000 iterations you should enter a value of 100000. Leaving it at 0 will make it run until you stop it manually. Default value is 0 (disabled).

Flip faces randomly ( y/n ?:help ) : useful option in cases where you don't have all the necessary angles of the person's face in the source dataset that you want to swap onto the target. For example, if you have a target/destination video with the person looking straight ahead and to the right, and your source only has faces looking straight ahead and to the left, you should enable this feature. Bear in mind that because no face is perfectly symmetrical, results may look less like src, and features of the source face (like beauty marks, scars, moles, etc.) will be mirrored. Default value is n (disabled).

Batch_size ( ?:help ) : the batch size setting affects how many faces are compared to each other in each iteration. The lowest value is 2 and you can go as high as your GPU will allow, which is limited by VRAM. The higher your model's resolution and dimensions and the more features you enable, the more VRAM will be needed, and thus a lower batch size will be possible.
How do you guess what batch size to use? You can either use trial and error, or help yourself by taking a look at what other people can achieve on their GPUs by checking out the DFL 2.0 spreadsheet: https://mrdeepfakes.com/forums/thread-dfl-2-0-user-model-settings-spreadsheet

Resolution ( 64-512 ?:help ) : here you set your model's resolution; bear in mind this option cannot be changed during training. It affects the resolution of the swapped faces - the higher the model resolution, the more detailed the learned face will be, but training will also be much heavier and longer. Resolution can be increased in increments of 16, from 64x64 to 512x512.

Face type ( h/mf/f/wf ?:help ) : this option lets you set the area of the face you want to train; there are 4 options - half face, mid-half face, full face and whole face.
Full face trains on most of the face area. Half face only trains from the mouth to the eyebrows, but can in some cases cut off the top or bottom of the face, while mid-half face covers a 30% bigger area than half face and also prevents most of the undesirable cut-off from occurring (it can still happen). It's recommended to use full face or whole face for the most flexibility, but half face and mid-half face offer better detail, because at the same model resolution more pixels are used to resolve the detail of the face (it's larger/more zoomed in). The whole face model is new and offers a way to train the actual whole face, including bits you'd normally not think you can swap; it can pretty much train the whole head as it covers all areas, even a bit of hair. During training it also has an option to prioritize training of just the face area or of the whole covered area (masked training).

[Image: UNDadcN.jpg]

Example of whole face model training face swap:

[Image: ldlVgZH.png]
AE architecture ( dfhd/liaehd/df/liae ?:help ) : this option lets you choose between 2 main learning architectures, DF and LIAE, as well as their HD versions which offer better quality at the cost of performance.

DF and LIAE architectures in DFL 2.0 SAEHD are based on the implementation of DF and LIAE models from DFL 1.0 SAE model.
Whereas DFHD and LIAEHD architectures in DFL 2.0 SAEHD are based on the implementation of DF and LIAE models from DFL 1.0 SAEHD model.
The essential difference between the HD and non-HD versions of the architectures is an increased number of layers in the HD model variants.

DF: This model architecture provides a more direct face swap; it doesn't morph faces, but requires that the source and target/destination face/head have a similar shape.
It works best on frontal shots, requires that your source dataset has all the needed angles, and can produce worse results on side profiles.

LIAE: This model architecture isn't as strict when it comes to face/head shape similarity between source and target/destination, but since it does morph the faces it's recommended to have the actual face features (eyes, nose, mouth, overall face structure) similar between source and target/destination. It offers worse resemblance to the source on frontal shots but can handle side profiles much better.

Below is a comparison between the DFHD, DF, LIAEHD and LIAE model architectures, trained on the same hardware, at the same resolution and with the same other parameters, using the same source and destination datasets.

[Image: onwo0z8.jpg]

Thanks to @kkdlux for making the comparison (Top row: ORIGINAL/LIAEHD/DFHD, Bottom row: LIAE/DF): You are not allowed to view links. Register or Login to view.

The next 4 options control the model's neural network dimensions, which affect its ability to learn. Modifying these can have a big impact on performance and on the quality of the learned faces, so they should generally be left at their defaults.

AutoEncoder dimensions ( 32-1024 ?:help ) : Auto encoder dimensions settings, affects overall ability of the model to learn faces.
Encoder dimensions ( 16-256 ?:help ) : Encoder dimensions settings, affects ability of the model to learn general structure of the faces.
Decoder dimensions ( 16-256 ?:help ) : Decoder dimensions settings, affects ability of the model to learn fine detail.
Decoder mask dimensions ( 16-256 ?:help ) : Mask decoder dimensions settings, affects quality of the learned mask when training with Learn mask enabled. Does not affect the quality of training.

Changing each setting (with the exception of Decoder mask dimensions) can have varying effects on performance, and it's not possible to measure the effect of each one on speed and quality without extensive testing. DFL's creator @iperov set them at default values that should offer optimal results and a good compromise between training speed and quality.

Also, when changing one parameter the others should be changed as well to keep the relations between them similar (for example, if you drop the Encoder and Decoder dimensions from 64 to 48 you could also decrease the AutoEncoder dimensions from 256 to 192-240). Values should be changed by a factor of 2. Feel free to experiment with various settings, but if you want better quality you're better off raising the resolution than changing these; if you want stable operation, keep them at default.

Learn mask ( y/n ?:help ) : enabling this setting will cause your model to start learning the shape of the faces to generate a mask that can then be used during merging. Masks are an essential part of the deepfake process, letting the merger place the new learned/deepfaked face over the original footage. By default the merger uses the dst mask that is generated during the faces extraction/alignment process of your data_dst. If you don't enable this feature and select learned mask in the converter/during merging, it will still use the dst mask. Learned masks are generally better than the default dst masks, but using this feature has a big impact on performance and VRAM usage, so it's best to first train the model to a certain degree or fully, and enable the mask only for a brief time (5-6k iterations) at the end or somewhere during training (it can be enabled and disabled multiple times). Learned mask has no effect on face quality, only on the mask. The learned mask can be used on its own or in combination with the FANseg mask modes. Default value is n (disabled).

Eyes priority ( y/n ?:help ) : Attempts to fix problems with eye training especially on HD architecture variants like DFHD and LIAEHD by forcing the neural network to train eyes with higher priority.
Bear in mind that it does not guarantee the right eye direction, it only affects the details of the eyes and area around them. Example (before and after):
[Image: YQHOuSR.jpg]

Place models and optimizer on GPU ( y/n ?:help ) : enabling the GPU optimizer puts all the load on your GPU, which greatly improves performance (iteration time) but leads to a lower batch size; disabling this feature (n) will offload some work (the optimizer) to the CPU, which decreases load on the GPU (and VRAM usage), letting you achieve a slightly higher batch size or run more taxing models (higher resolution or model dimensions) at the cost of training speed (longer iteration time).
Basically, if you get OOM (out of memory) errors you should disable this feature; some work will be offloaded to your CPU and some data from the GPU's VRAM to system RAM, so you will be able to run your model without OOM errors and/or at a higher batch size, but at the cost of lower performance. Default value is y (enabled).

Use learning rate dropout ( y/n ?:help ) : this feature should only be enabled at the very end of training, and should never be enabled while features like random warp of samples or flip faces randomly are still enabled. Once your model is fairly well trained and sharp and you've disabled random warp of samples, it will let you get a bit more detail and sharpness in fewer iterations than it would normally take. Use with caution - enabling it before the model is fully trained may cause it to never improve until you disable it and let the training go on with this feature disabled. Default value is n (disabled).

Enable random warp of samples ( y/n ?:help ) : random warp of samples is a feature that used to be enabled all the time in the old SAE models of DFL 1.0, but is now optional. It's used to generalize the model so that it properly learns all the basic shapes, face features, structure of the face, expressions and so on, but as long as it's enabled the model may have trouble learning fine detail. Because of this it's recommended to keep the feature enabled as long as your faces are still improving (judging by the decreasing loss values and the preview window); once the faces are fully trained and you want to get some more detail, you should disable it, and after a few hundred to a few thousand iterations you should start to see more detail; you then carry on training with this feature disabled. Default value is y (enabled).

GAN power ( 0.0 .. 10.0 ?:help ) : GAN stands for Generative Adversarial Network, and in DFL 2.0 it is implemented as an additional way of training on your datasets to get more detailed/sharper faces. This option is adjustable on a scale from 0.0 to 10.0 and should only be enabled once the model is more or less done training (after you've disabled random warp of samples). It's recommended to start at a low value before going all the way to the max, to test whether the feature gives good results, as it heavily depends on having a good and clean source dataset. If you get bad results you need to disable it and enable random warp of samples for some time so that the model can recover. Consider making a backup before enabling this feature. Default value is 0.0 (disabled).

Here is an example before and after enabling GAN training:

Before:
You are not allowed to view links. Register or Login to view.
After:
[Image: CYAJmJx.jpg]
If it's hard to notice the difference in the 1st example open it up in a new window.

'True face' power. ( 0.0000 .. 1.0 ?:help ) : true face training with a variable power setting lets you set the model discriminator to a higher or lower value; what this does is try to make the final face look more like src. As with GAN, this feature should only be enabled once random warp is disabled and the model is more or less fully trained. Here you should also start at a low value and make sure your source dataset is clean and correctly aligned; if you get bad results you need to disable it and enable random warp of samples for some time so that the model can recover. Consider making a backup before enabling this feature. Default value is 0.0 (disabled).

Here is an example:
[Image: czScS9q.png]

Face style power ( 0.0..100.0 ?:help ) and Background style power ( 0.0..100.0 ?:help ) : this variable setting controls style transfer of either the face or the background part of the image; it is used to transfer the style of your target/destination faces (data_dst) to the final learned face, which can improve the quality and look of the final result after merging, but high values can cause the learned face to look more like data_dst than data_src.

It's recommended to use values up to 10 and decrease them during training down to 1 or even 0.1.
This feature has big performance impact and using it will increase iteration time and may require you to lower your batch size or disable gpu optimizer (Place models and optimizer on GPU). Consider making a backup before enabling this feature.

Examples of things that this option may do include transferring the style/color of lips, eyes and makeup from data_dst to the final learned face, and also carrying over some features of the face (skin color, some textures or facial features). The stronger the setting, the more style will be transferred from data_dst to the final learned face. Default value is 0.0 (disabled).

Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : this feature is used to match the colors of your data_src to the data_dst, so that the final result has a similar skin color/tone to the data_dst and doesn't change colors when the face moves around (which may happen if various face angles were taken from sources with different lighting conditions or different color grading). There are several options to choose from:

- rct (reinhard color transfer): based on: You are not allowed to view links. Register or Login to view.
- lct (linear color transfer): Matches the color distribution of the target image to that of the source image using a linear transform.
- mkl (Monge-Kantorovitch linear): based on: You are not allowed to view links. Register or Login to view.
- idt (Iterative Distribution Transfer): based on: You are not allowed to view links. Register or Login to view.
- sot (sliced optimal transfer): based on: You are not allowed to view links. Register or Login to view.
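
To give a feel for what these modes do, here is a minimal sketch of the simplest one, Reinhard color transfer (rct): the source face is shifted so that its per-channel mean and standard deviation in LAB color space match those of the destination face. This is only an illustration of the general idea, not DFL's exact implementation; cv2 (OpenCV) and numpy are assumed to be installed.

Code:
import cv2
import numpy as np

def reinhard_color_transfer(src_bgr, dst_bgr):
    """Match src's per-channel LAB mean/std to dst's (Reinhard-style rct sketch)."""
    src = cv2.cvtColor(src_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    dst = cv2.cvtColor(dst_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)

    src_mean, src_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    dst_mean, dst_std = dst.mean(axis=(0, 1)), dst.std(axis=(0, 1))

    # Normalize src statistics, then re-scale them to dst statistics.
    out = (src - src_mean) / (src_std + 1e-6) * dst_std + dst_mean
    out = np.clip(out, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)

# Usage: matched = reinhard_color_transfer(cv2.imread("src_face.jpg"), cv2.imread("dst_face.jpg"))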

Examples: (coming soon)

Enable gradient clipping ( y/n ?:help ) : this feature is implemented to prevent so-called model collapse/corruption, which may occur when using various features of DFL 2.0. It has a small performance impact; if you really don't want to use it, you must at least enable auto backups, as a collapsed model cannot recover and must be scrapped, with training started all over. Default value is n (disabled), but since the performance impact is so low and it can save you a lot of time by preventing model collapse, I recommend always enabling it on all models.

Enable pretraining mode ( y/n ?:help ) : enables a pretraining process that uses a dataset of random people's faces to initially train your model. After training it to around 50k-100k iterations, such a model can then be reused when starting training with the proper data_src and data_dst you want to train. It saves time because you don't have to start training from 0 every time. It's recommended to either pretrain a model using this feature, make your own data_src and data_dst with random faces of people, or grab a pretrained model from our forum:
You are not allowed to view links. Register or Login to view.
Default value is n (disabled).
NOTE: The pretrain option can be enabled at any time but it's recommended to pretrain a model only once at the start (to around 200-400k iterations).

7. Merging:

After you're done training your model, it's time to merge the learned face over the original frames to form the final video (convert).

For that we have 2 converters corresponding to 2 available models:

7) merge SAEHD
7) merge Quick96

Upon selecting any of those a command line window will appear with several prompts.

The 1st one will ask you if you want to use the interactive converter; the default value is y (enabled) and it's recommended to use it over the regular one, because it has all the features and also an interactive preview where you see the effects of all the changes you make when changing various options and enabling/disabling various features:
Use interactive merger? ( y/n ) :

2nd one will ask you which model you want to use:
Choose one of saved models, or enter a name to create a new model.
[r] : rename
[d] : delete
[0] : df160 - latest
:

3rd one will ask you which GPU/GPUs or CPU you want to use for the merging (conversion) process:
Choose one or several GPU idxs (separated by comma).
[CPU] : CPU
[0] : GeForce GTX 1060 6GB
[0] Which GPU indexes to choose? :

Pressing enter will use default value (0).

After that's done you will see a command line window with current settings as well as preview window which shows all the controls needed to operate the interactive converter/merger.

Here is a quick look at both the command line window and converter preview window:
[Image: BT6vAzW.png]

The converter features many options that you can use to change the mask type, its size and feathering/blur; you can also add additional color transfer and sharpen/enhance the final trained face even further.

Here is the list of all merger/converter features explained:

1. Main overlay modes:
- original: displays the original frame without the swapped face
- overlay: simply overlays the learned face over the frame
- hist-match: overlays the learned face and tries to match it based on the histogram (has 2 modes: normal and masked hist match, toggleable with the Z button)
- seamless: uses the OpenCV Poisson seamless clone function to blend the new learned face over the head in the original frame
- seamless hist match: combines both hist-match and seamless
- raw-rgb: overlays the raw learned face without any masking

NOTE: Seamless modes can cause flickering, it's recommended to use overlay.

2. Hist match threshold: controls strength of the histogram matching in hist-match and seamless hist-match overlay mode.
Q - increases value
A - decreases value


3. Erode mask: controls the size of a mask.
W - increases mask erosion (smaller mask)
S - decreases mask erosion (bigger mask)


4. Blur mask: blurs/feathers the edge of the mask for smoother transition
E - increases blur
D - decreases blur
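
Conceptually, erode and blur modify the face mask before the learned face is composited onto the original frame: erosion shrinks the mask, blurring feathers its edge, and the result is used as an alpha channel. Below is a rough sketch of that idea (not DFL's actual merger code), assuming cv2/numpy, a single-channel 0-255 mask and frame/face images of the same size:

Code:
import cv2
import numpy as np

def composite(frame, face, mask, erode_px=10, blur_px=30):
    """Blend 'face' over 'frame' using 'mask' after erosion and edge blurring."""
    if erode_px > 0:
        kernel = np.ones((erode_px, erode_px), np.uint8)
        mask = cv2.erode(mask, kernel)            # smaller mask
    if blur_px > 0:
        k = blur_px | 1                           # Gaussian kernel size must be odd
        mask = cv2.GaussianBlur(mask, (k, k), 0)  # feathered edge
    alpha = mask.astype(np.float32)[..., None] / 255.0
    return (face * alpha + frame * (1.0 - alpha)).astype(np.uint8)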


5. Motion blur: upon entering the initial parameters (interactive converter, model, GPU/CPU) the merger/converter loads all frames and the data_dst aligned data; while it's doing so it calculates motion vectors that are used to create the motion blur effect this setting controls. It lets you add blur in places where the face moves around, but high values may blur the face even with small movement. The option only works if one set of faces is present in the "data_dst/aligned" folder - if during cleanup you had some faces with _1 prefixes (even if only faces of one person are present) the effect won't work; the same goes if there is a mirror that reflects the target person's face. In such cases you cannot use motion blur, and the only way to add it is to train each set of faces separately.
R - increases motion blur
F - decreases motion blur


6. Super resolution: uses an algorithm similar to the data_src dataset/faceset enhancer; it can add more definition to areas such as teeth and eyes and enhance the detail/texture of the learned face.
T - increases the enhancement effect
G - decreases the enhancement effect


7. Blur/sharpen: blurs or sharpens the learned face using box or gaussian method.
Y - sharpens the face
H - blurs the face
N - box/gaussian mode switch


8. Face scale: scales learned face to be larger or smaller.
U - scales learned face down
J - scales learned face up

9. Mask modes: there are 6 masking modes:

dst: uses masks derived from the shape of the landmarks generated during data_dst faceset/dataset extraction.
learned mask: uses masks learned during training, as described in step 6. If learned mask was disabled it will use the dst mask instead.
fan-prd: 1st FANseg masking method - predicts the mask shape during merging and takes obstructions (hands, glasses, other objects covering the face) into account to mask them out.
fan-dst: 2nd FANseg masking method - predicts the mask shape during merging by taking the dst mask shape into account, plus obstructions.
fan-prd + fan-dst: 3rd FANseg masking method - combines the fan-prd and fan-dst methods.
fan-prd + fan-dst + learned: combines the fan-prd, fan-dst and learned mask methods.

The fastest masking method is dst, but it cannot exclude obstructions; learned mask is better in terms of shape but also cannot exclude them; fan-dst is a bit slower but can exclude obstructions and is generally good enough in most cases; fan-prd can be a bit unpredictable so it's not recommended; fan-dst+prd doesn't offer much better masks than dst; and the 6th option, which combines fan-prd, fan-dst and learned mask, is the best one but also the slowest, and requires you to have trained with learn mask enabled.

10. Color transfer modes: similar to color transfer during training, you can use this feature to better match skin color of the learned face to the original frame for more seamless and realistic face swap. There are 8 different modes:

RCT
LCT
MKL
MKL-M
IDT
IDT-M
SOT - M
MIX-M


examples coming soon.

11. Image degrade modes: there are 3 settings that you can use to affect the look of the original frame (without affecting the swapped face):
Denoise - denoises the image, making it slightly blurry (I - increases effect, K - decreases effect)
Bicubic - blurs the image using the bicubic method (O - increases effect, L - decreases effect)
Color - decreases color bit depth (P - increases effect, ; - decreases effect)

Additional controls:

TAB button - switch between main preview window and help screen.
Bear in mind you can only change parameters in the main preview window, pressing any other buttons on the help screen won't change them.
-/_ and =/+ buttons are used to scale the preview window.
Use caps lock to change the increment from 1 to 10 (affects all numerical values).

To save/override settings for all next frames from current one press shift + / key.
To save/override settings for all previous frames from current one press shift + M key.
To start merging of all frames press shift + > key.
To go back to the 1st frame press shift + < key.
To only convert next frame press > key.
To go back 1 frame press < key.

8. Conversion of frames back into video:

After you merge/convert all the faces you will have a folder named "merged" inside your "data_dst" folder, containing all the frames that make up the video.
The last step is to convert them back into a video and combine it with the original audio track from the data_dst.mp4 file.

To do so you will use one of the 4 provided .bat files, which use FFmpeg to combine all the frames into a video in one of the following formats - avi, mp4, lossless mp4 or lossless mov:

- 8) merged to avi
- 8) merged to mov lossless
- 8) merged to mp4 lossless
- 8) merged to mp4
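
These .bat files also just wrap FFmpeg; a rough standalone equivalent of the plain mp4 output is sketched below (the frame rate, frame naming pattern and codec settings are illustrative and may not match your build exactly):

Code:
import subprocess

# Combine the merged frames into a video and take the audio track from data_dst.mp4.
subprocess.run([
    "ffmpeg",
    "-framerate", "30",                           # frame rate of the original data_dst.mp4 (placeholder)
    "-i", "workspace/data_dst/merged/%05d.png",   # merged frames (naming pattern is illustrative)
    "-i", "workspace/data_dst.mp4",
    "-map", "0:v", "-map", "1:a?",                # video from the frames, audio (if any) from data_dst.mp4
    "-c:v", "libx264", "-crf", "18",
    "-c:a", "aac",
    "-pix_fmt", "yuv420p",
    "workspace/result.mp4",
], check=True)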

And that's it! After you've done all these steps you should have a file called result.xxx (avi/mp4/mov), which is your deepfake video.

------------------------------------------------------------------------------------------------------------------------------------------------------

Expect more updates to this guide in the future; for now that's it. If you have more questions that weren't covered in this guide or in any of the following threads:

DFL 1.0 Guide: You are not allowed to view links. Register or Login to view.
DFL 1.0/2.0 FAQ: You are not allowed to view links. Register or Login to view.
Celebrity/Source creation guide: You are not allowed to view links. Register or Login to view.
DFL 2.0 General overview thread: You are not allowed to view links. Register or Login to view.
Or in any other thread in the questions sections: You are not allowed to view links. Register or Login to view.

Feel free to post them here. However, questions regarding common issues won't be answered - keep the thread clean and spam free. If you have a serious issue with the software, either create a new thread in the questions section or check the GitHub page for reported bugs/issues.

Current issues/bugs:

Most of the issues you can find on the github page: You are not allowed to view links. Register or Login to view.
Before you create a new thread about an issue/bug, please check the GitHub page to see if it wasn't already reported, and whether it was fixed or there is a temporary workaround.
Threads about issues go in this forum section: You are not allowed to view links. Register or Login to view.
If I helped you in any way or you enjoy my deepfakes, please consider a small donation via bitcoin, tokens or paypal/patreon.
Paypal/Patreon: You are not allowed to view links. Register or Login to view.
Bitcoin: 1C3dq9zF2DhXKeu969EYmP9UTvHobKKNKF
Want to request a paid deepfake, or have any questions regarding the forums or deepfake creation using DeepFaceLab? Write me a message.
TMB-DF on the main website - You are not allowed to view links. Register or Login to view.
#2
CHANGELOG:

Code:
Official repository: https://github.com/iperov/DeepFaceLab

Please consider a donation.

============ USER SUPPORT ============ 

MrDeepFakes Guides and tutorials: https://mrdeepfakes.com/forums/forum-guides-and-tutorials

Telegram Chat (English/Russian): https://t.me/DeepFaceLab_official

Telegram Chat (English only): https://t.me/DeepFaceLab_official_en

Reddit ・r/GifFakes/: https://www.reddit.com/r/GifFakes/new/

Reddit ・r/SFWdeepfakes/: https://www.reddit.com/r/SFWdeepfakes/new/

============ CHANGELOG ============ 

== 15.03.2020 ==

global fixes

SAEHD: removed option learn_mask, it is now enabled by default

removed liaech archi

removed support of extracted(aligned) PNG faces. Use old builds to convert from PNG to JPG.



added XSeg model.

with XSeg model you can train your own mask segmentator of dst(and src) faces 
that will be used in merger for whole_face.

Instead of using a pretrained model (which does not exist),
you control which part of faces should be masked.


Workflow is not easy, but at the moment it is the best solution 
for obtaining the best quality of whole_face's deepfakes using minimum effort
without rotoscoping in AfterEffects.

new scripts:
    XSeg) data_dst edit.bat
    XSeg) data_dst merge.bat
    XSeg) data_dst split.bat
    XSeg) data_src edit.bat
    XSeg) data_src merge.bat
    XSeg) data_src split.bat
    XSeg) train.bat

Usage:
    unpack dst faceset if packed

    run XSeg) data_dst split.bat
        this scripts extracts (previously saved) .json data from jpg faces to use in label tool.

    run XSeg) data_dst edit.bat
        new tool 'labelme' is used

        use polygon (CTRL-N) to mask the face
            name polygon "1" (one symbol) as include polygon
            name polygon "0" (one symbol) as exclude polygon
    
            'exclude polygons' will be applied after all 'include polygons'

        Hot keys:
        ctrl-N            create polygon
        ctrl-J            edit polygon
        A/D             navigate between frames
        ctrl + mousewheel     image zoom
        mousewheel        vertical scroll
        alt+mousewheel        horizontal scroll
    
        repeat for 10/50/100 faces, 
            you don't need to mask every frame of dst, 
            only frames where the face is different significantly,
            for example:
                closed eyes
                changed head direction
                changed light    
            the more various faces you mask, the more quality you will get

            Start masking from the upper left area and follow the clockwise direction.
            Keep the same logic of masking for all frames, for example:
                the same approximated jaw line of the side faces, where the jaw is not visible
                the same hair line
            Mask the obstructions using polygon with name "0".

    run XSeg) data_dst merge.bat
        this script merges .json data of polygons into jpg faces, 
        therefore faceset can be sorted or packed as usual.
        
    run XSeg) train.bat
        train the model

        Check the faces of 'XSeg dst faces' preview.

        if some faces have wrong or glitchy mask, then repeat steps:
            split
            run edit
            find these glitchy faces and mask them
            merge
            train further or restart training from scratch

Restart training of XSeg model is only possible by deleting all 'model\XSeg_*' files.

If you want to get the mask of the predicted face in merger,
you should repeat the same steps for src faceset.

New mask modes available in merger for whole_face:

XSeg-prd      - XSeg mask of predicted face    -> faces from src faceset should be labeled
XSeg-dst      - XSeg mask of dst face        -> faces from dst faceset should be labeled
XSeg-prd*XSeg-dst - the smallest area of both

if workspace\model folder contains trained XSeg model, then merger will use it,
otherwise you will get transparent mask by using XSeg-* modes.

Some screenshots:
label tool: https://i.imgur.com/aY6QGw1.jpg
trainer   : https://i.imgur.com/NM1Kn3s.jpg
merger    : https://i.imgur.com/glUzFQ8.jpg

example of the fake using 13 segmented dst faces
          : https://i.imgur.com/wmvyizU.gifv

== 07.03.2020 ==

returned back
3.optional) denoise data_dst images.bat
    Apply it if dst video is very sharp.

    Denoise dst images before face extraction.
    This technique helps neural network not to learn the noise.
    The result is less pixel shake of the predicted face.
    

SAEHD:

added new experimental archi
'liaech' - made by @chervonij. Based on liae, but produces more src-like face.

lr_dropout is now disabled in pretraining mode. 

Sorter:

added sort by "face rect size in source image"
small faces from source image will be placed at the end

added sort by "best faces faster"
same as sort by "best faces" 
but faces will be sorted by source-rect-area instead of blur. 

== 28.02.2020 ==

Extractor:

image size for all faces is now 512

fix RuntimeWarning during the extraction process

SAEHD:

max resolution is now 512

fix hd architectures. Some decoder's weights haven't trained before.

new optimized training: 
for every <batch_size*16> samples, 
model collects <batch_size> samples with the highest error and learns them again
therefore hard samples will be trained more often

'models_opt_on_gpu' option is now available for multigpus (before only for 1 gpu)

fix 'autobackup_hour'

== 23.02.2020 ==

SAEHD: pretrain option is now available for whole_face type

fix sort by abs difference
fix sort by yaw/pitch/best for whole_face's

== 21.02.2020 ==

Trainer: decreased time of initialization

Merger: fixed some color flickering in overlay+rct mode

SAEHD:

added option Eyes priority (y/n)

    Helps to fix eye problems during training like "alien eyes" 
    and wrong eyes direction ( especially on HD architectures ) 
    by forcing the neural network to train eyes with higher priority. 
    before/after https://i.imgur.com/YQHOuSR.jpg

added experimental face type 'whole_face'

    Basic usage instruction: https://i.imgur.com/w7LkId2.jpg
    
    'whole_face' requires skill in Adobe After Effects.

    For using whole_face you have to extract whole_face's by using
    4) data_src extract whole_face
    and
    5) data_dst extract whole_face
    Images will be extracted in 512 resolution, so they can be used for regular full_face's and half_face's.
    
    'whole_face' covers whole area of face include forehead in training square, 
    but training mask is still 'full_face'
    therefore it requires manual final masking and composing in Adobe After Effects.

added option 'masked_training'
    This option is available only for 'whole_face' type. 
    Default is ON.
    Masked training clips training area to full_face mask, 
    thus network will train the faces properly.  
    When the face is trained enough, disable this option to train all area of the frame. 
    Merge with 'raw-rgb' mode, then use Adobe After Effects to manually mask, tune color, and compose whole face include forehead.

== 03.02.2020 ==

"Enable autobackup" option is replaced by
"Autobackup every N hour" 0..24 (default 0 disabled), Autobackup model files with preview every N hour

Merger:

'show alpha mask' now on 'V' button

'super resolution mode' is replaced by
'super resolution power' (0..100) which can be modified via 'T' 'G' buttons

default erode/blur values are 0.

new multiple faces detection log: https://i.imgur.com/0XObjsB.jpg

now uses all available CPU cores ( before max 6 ) 
so the more processors, the faster the process will be.

== 01.02.2020 ==

Merger: 

increased speed

improved quality

SAEHD: default archi is now 'df'

== 30.01.2020 ==

removed use_float16 option

fix MultiGPU training

== 29.01.2020 ==

MultiGPU training:
fixed CUDNN_STREAM errors.
speed is significantly increased.

Trainer: added key 'b' : creates a backup even if the autobackup is disabled.

== 28.01.2020 ==

optimized face sample generator, CPU load is significantly reduced

fix of update preview for history after disabling the pretrain mode


SAEHD: 

added new option
GAN power 0.0 .. 10.0
    Train the network in a Generative Adversarial manner. 
    Forces the neural network to learn small details of the face. 
    You can enable/disable this option at any time, 
    but it's better to enable it when the network is trained enough.
    Typical value is 1.0.
    GAN power does not work in pretrain mode.
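
Conceptually, GAN power just scales how much an adversarial "realness" loss is added on top of the usual reconstruction loss. A hedged sketch of that weighting (names hypothetical, not DFL's implementation):

def total_generator_loss(recon_loss, adv_loss, gan_power):
    # gan_power == 0.0 disables the GAN term entirely; a typical value is 1.0.
    # Conceptual sketch only of how the option weights the two losses.
    return recon_loss + gan_power * adv_loss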

Example of enabling GAN on 81k iters +5k iters
https://i.imgur.com/OdXHLhU.jpg
https://i.imgur.com/CYAJmJx.jpg


dfhd: default Decoder dimensions are now 48
the preview for 256 res is now correctly displayed

fixed model naming/renaming/removing


Improvements for those involved in post-processing in AfterEffects:

Codec is reverted back to x264 so the output can be used properly in AfterEffects and video players.

Merger now always outputs the mask to workspace\data_dst\merged_mask

removed raw modes except raw-rgb
raw-rgb mode now outputs selected face mask_mode (before square mask)

'export alpha mask' button is replaced by 'show alpha mask'.
You can view the alpha mask without recomputing the frames.

8) 'merged *.bat' now also outputs a 'result_mask.' video file.
8) 'merged lossless' now uses the x264 lossless codec (before PNG codec)
result_mask video file is always lossless.

Thus you can use the result_mask video file as a mask layer in AfterEffects.


== 25.01.2020 ==

Upgraded to TF version 1.13.2

Removed the wait at first launch for most graphics cards.

Increased speed of training by 10-20%, but you have to retrain all models from scratch.

SAEHD: 

added option 'use float16'
    Experimental option. Reduces the model size by half. 
    Increases the speed of training. 
    Decreases the accuracy of the model. 
    The model may collapse or not train.
    Model may not learn the mask in large resolutions.
    You can enable/disable this option at any time.

true_face_training option is replaced by
"True face power". 0.0000 .. 1.0
Experimental option. Discriminates the result face to be more like the src face. Higher value - stronger discrimination.
Comparison - https://i.imgur.com/czScS9q.png

== 23.01.2020 ==

SAEHD: fixed clipgrad option

== 22.01.2020 == BREAKING CHANGES !!!

Getting rid of the weakest link - AMD cards support.
All neural network codebase transferred to pure low-level TensorFlow backend, therefore
removed AMD/Intel cards support, now DFL works only on NVIDIA cards or CPU.

old DFL, marked as 1.0, is still available for download, but it will no longer be supported.

global code refactoring, fixes and optimizations

Extractor:

now you can choose on which GPUs (or CPU) to process 

improved stability for < 4GB GPUs

increased speed of multi gpu initializing

now works in one pass (except manual mode)
so you won't lose the processed data if something goes wrong before the old 3rd pass

Faceset enhancer:

now you can choose on which GPUs (or CPU) to process 

Trainer:

now you can choose on which GPUs (or CPU) to train the model. 
Multi-gpu training is now supported.
Select identical cards, otherwise the fast GPU will wait for the slow GPU every iteration.

now remembers the previously entered options as defaults for the current workspace/model/ folder.

the number of sample generators now matches the available number of processors

saved models now have names instead of GPU indexes.
Therefore you can switch GPUs for every saved model.
Trainer offers to choose latest saved model by default.
You can rename or delete any model using the dialog.

models now save the optimizer weights in the model folder to continue training properly

removed all models except SAEHD, Quick96

trained model files from DFL 1.0 cannot be reused

AVATAR model is also removed.
How to create AVATAR like in this video? https://www.youtube.com/watch?v=4GdWD0yxvqw
1) capture yourself with your own speech repeating same head direction as celeb in target video
2) train regular deepfake model with celeb faces from target video as src, and your face as dst
3) merge celeb face onto your face with raw-rgb mode
4) compose masked mouth with target video in AfterEffects

SAEHD:

now has 3 options: Encoder dimensions, Decoder dimensions, Decoder mask dimensions

now has 4 archis: dfhd (default), liaehd, df, liae
df and liae are from the SAE model, but use features from the SAEHD model (such as combined loss and disabled random warp)

dfhd/liaehd - changed encoder/decoder architectures

the decoder model is combined with the mask decoder model;
mask training is combined with face training, 
resulting in reduced time per iteration and decreased VRAM usage by the optimizer

"Initialize CA weights" now works faster and integrated to "Initialize models" progress bar

removed optimizer_mode option

added option 'Place models and optimizer on GPU?'
  When you train on one GPU, by default model and optimizer weights are placed on GPU to accelerate the process. 
  You can place them on the CPU to free up extra VRAM, allowing you to set larger model parameters.
  This option is unavailable in MultiGPU mode.

pretraining now does not use rgb channel shuffling
pretraining now can be continued
when pre-training is disabled:
1) iters and loss history are reset to 1 
2) in df/dfhd archis, only the inter part of the encoder is reset (before encoder+inter)
   thus the fake will train faster with a pretrained df model

Merger ( renamed from Converter ):

now you can choose on which GPUs (or CPU) to process 

new hot key combinations to navigate and override frame's configs

super resolution upscaler "RankSRGAN" is replaced by "FaceEnhancer"

FAN-x mask mode now works on GPU while merging (before on CPU),
therefore all models (Main face model + FAN-x + FaceEnhancer) 
now work on GPU while merging, and work properly even on 2GB GPU.

Quick96:

now automatically uses pretrained model

Sorter:

removed all sort by *.bat files except one sort.bat
now you have to choose sort method in the dialog

Other:

all console dialogs are now more convenient

XnViewMP is updated to 0.94.1 version

ffmpeg is updated to 4.2.1 version

ffmpeg: video codec is changed to x265

_internal/vscode.bat starts VSCode IDE where you can view and edit DeepFaceLab source code.

removed the Russian/English manuals. Read community manuals and tutorials here:
https://mrdeepfakes.com/forums/forum-guides-and-tutorials

new github page design

== 11.01.2020 ==

fix freeze on sample loading

== 08.01.2020 ==

fixes and optimizations in sample generators

fixed Quick96 and removed lr_dropout from SAEHD for OpenCL build.

CUDA build now works on lower-end GPU with 2GB VRAM:
GTX 880M GTX 870M GTX 860M GTX 780M GTX 770M
GTX 765M GTX 760M GTX 680MX GTX 680M GTX 675MX GTX 670MX 
GTX 660M GT 755M GT 750M GT 650M GT 745M GT 645M GT 740M 
GT 730M GT 640M GT 735M GT 730M GTX 770 GTX 760 GTX 750 Ti 
GTX 750 GTX 690 GTX 680 GTX 670 GTX 660 Ti GTX 660 GTX 650 Ti GTX 650 GT 740

== 29.12.2019 ==

fix faceset enhancer for faces that contain edited mask

fix long load when using various gpus in the same DFL folder

fix extract unaligned faces 

avatar: avatar_type is now only head by default

== 28.12.2019 ==

FacesetEnhancer now asks to merge aligned_enhanced/ to aligned/

fix 0 faces detected in manual extractor

Quick96, SAEHD: optimized architecture. You have to restart training.

Now there are only two builds: CUDA (based on 9.2) and Opencl.

== 26.12.2019 ==

fixed mask editor

added FacesetEnhancer
4.2.other) data_src util faceset enhance best GPU.bat
4.2.other) data_src util faceset enhance multi GPU.bat

FacesetEnhancer greatly increases detail in your source faceset,
similar to the Gigapixel enhancer, but in fully automatic mode.
In the OpenCL build it works on CPU only.

before/after https://i.imgur.com/TAMoVs6.png

== 23.12.2019 ==

Extractor: 2nd pass now faster on frames where faces are not found

all models: removed options 'src_scale_mod', and 'sort samples by yaw as target'
If you want, you can manually remove unnecessary angles from src faceset after sort by yaw.

Optimized sample generators (CPU workers). Now they consume less RAM and work faster.

added 
4.2.other) data_src/dst util faceset pack.bat
    Packs /aligned/ samples into one /aligned/samples.pak file.
    After that, all faces will be deleted.

4.2.other) data_src/dst util faceset unpack.bat
    unpacks faces from /aligned/samples.pak to /aligned/ dir.
    After that, samples.pak will be deleted.

A packed faceset loads and works faster.
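
DFL's samples.pak is its own binary format; purely to illustrate the idea of packing many small face images into one file (fewer filesystem reads, faster loading), here is a hedged sketch using pickle - not the real .pak layout:

import os, glob, pickle

def pack_faceset(aligned_dir, pak_path):
    # Pack all aligned face images into one file (illustrative only).
    payload = {}
    for path in sorted(glob.glob(os.path.join(aligned_dir, "*.jpg"))):
        with open(path, "rb") as f:
            payload[os.path.basename(path)] = f.read()
    with open(pak_path, "wb") as f:
        pickle.dump(payload, f)

def unpack_faceset(pak_path, aligned_dir):
    # Restore the images from the packed file back into the directory.
    with open(pak_path, "rb") as f:
        payload = pickle.load(f)
    for name, data in payload.items():
        with open(os.path.join(aligned_dir, name), "wb") as f:
            f.write(data)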


== 20.12.2019 ==

fix 3rd pass of extractor for some systems

More stable and precise version of the face transformation matrix

SAEHD: lr_dropout is now an option, disabled by default.
When the face is trained enough, you can enable this option to get extra sharpness in fewer iterations.
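
The "learning rate dropout" idea from the linked paper is, roughly, to randomly zero the update for a subset of weights each step, which acts as a regularizer. A minimal sketch of that update rule (not DFL's optimizer code; names and values are illustrative):

import numpy as np

def lr_dropout_update(weights, grads, lr=1e-4, keep_prob=0.7):
    # SGD-style update where each weight's learning rate is randomly
    # dropped (set to zero) this step with probability 1 - keep_prob.
    mask = (np.random.rand(*weights.shape) < keep_prob).astype(weights.dtype)
    return weights - lr * mask * grads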


added
4.2.other) data_src util faceset metadata save.bat
    saves metadata of data_src\aligned\ faces into data_src\aligned\meta.dat

4.2.other) data_src util faceset metadata restore.bat
    restores metadata from 'meta.dat' to the images
    if the image size differs from the original, it will be automatically resized

You can greatly enhance face details of src faceset by using Topaz Gigapixel software.
example before/after https://i.imgur.com/Gwee99L.jpg
Download it from torrent https://rutracker.org/forum/viewtopic.php?t=5757118
Example of workflow:

1) run 'data_src util faceset metadata save.bat'
2) launch Topaz Gigapixel
3) open 'data_src\aligned\' and select all images
4) set output folder to 'data_src\aligned_topaz' (create folder in save dialog)
5) set settings as on screenshot https://i.imgur.com/kAVWMQG.jpg
    you can choose 2x, 4x, or 6x upscale rate
6) start processing the images and wait for the whole process to finish
7) rename folders:
    data_src\aligned        ->  data_src\aligned_original
    data_src\aligned_topaz  ->  data_src\aligned
8) copy 'data_src\aligned_original\meta.dat' to 'data_src\aligned\'
9) run 'data_src util faceset metadata restore.bat'
    images will be downscaled back to original size (256x256) preserving details
    metadata will be restored
10) now your new enhanced faceset is ready to use !





== 15.12.2019 ==

SAEHD,Quick96:
improved model generalization, overall accuracy and sharpness 
by using new 'Learning rate dropout' technique from the paper https://arxiv.org/abs/1912.00144
An example of a loss histogram where this function is enabled after the red arrow:
https://i.imgur.com/3olskOd.jpg


== 12.12.2019 ==

removed FacesetRelighter due to low quality of the result 

added sort by absdiff
This sorts by the absolute per-pixel difference between all faces.
options:
Sort by similar? ( y/n ?:help skip:y ) :
if you choose 'n', the most dissimilar faces will be placed first.
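
As a rough illustration of what a per-pixel absolute-difference sort does (a hypothetical helper, not the Sorter's real code): each face gets a dissimilarity score against the others and the list is ordered by it.

import numpy as np

def sort_by_absdiff(faces, similar_first=True):
    # `faces` is a list of equally sized numpy images. similar_first=False
    # puts the most dissimilar faces first, matching 'Sort by similar? n'.
    # Illustrative O(n^2) sketch only.
    scores = []
    for i, a in enumerate(faces):
        score = sum(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum()
                    for j, b in enumerate(faces) if i != j)
        scores.append(score)
    order = np.argsort(scores)            # low score = most similar to the rest
    if not similar_first:
        order = order[::-1]
    return [faces[i] for i in order]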

'sort by final' renamed to 'sort by best'

OpenCL: fix extractor for some amd cards

== 14.11.2019 ==

Converter: added new color transfer mode: mix-m

== 13.11.2019 ==

SAE,SAEHD,Converter:
added sot-m color transfer

Converter:
removed seamless2 mode

FacesetRelighter:
Added intensity parameter to the manual picker.
'One random direction' and 'predefined 7 directions' use random intensity from 0.3 to 0.6.

== 12.11.2019 ==

FacesetRelighter fixes and improvements:

now you have 3 ways:
1) define light directions manually (not for google colab)
   watch demo https://youtu.be/79xz7yEO5Jw
2) relight faceset with one random direction
3) relight faceset with predefined 7 directions

== 11.11.2019 ==

added FacesetRelighter:
Synthesize new faces from existing ones by relighting them using DeepPortraitRelighter network.
With the relighted faces the neural network will better reproduce face shadows.

Therefore you can synthesize shadowed faces from a fully lit faceset.
https://i.imgur.com/wxcmQoi.jpg

as a result, better fakes on dark faces:
https://i.imgur.com/5xXIbz5.jpg

operate via
data_x add relighted faces.bat
data_x delete relighted faces.bat

in OpenCL build Relighter runs on CPU

== 09.11.2019 ==

extractor: removed "increased speed of S3FD" for compatibility reasons

converter: 
fixed crashes
removed useless 'ebs' color transfer
changed keys for color degrade

added image degrade via denoise - same as 'denoise extracted data_dst.bat', 
but you can control this option directly in the interactive converter

added image degrade via bicubic downscale/upscale 
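
The bicubic degrade is conceptually just a downscale followed by an upscale back to the original size, which softens over-sharp dst frames. A small OpenCV sketch (the scale mapping is illustrative, not the converter's exact behaviour):

import cv2

def bicubic_degrade(img, power=0.5):
    # Soften an over-sharp frame by downscaling and upscaling it again.
    # power in (0, 1): higher = stronger degrade.
    h, w = img.shape[:2]
    scale = 1.0 - 0.75 * power                      # e.g. power=0.5 -> 62.5% size
    small = cv2.resize(img, (int(w * scale), int(h * scale)),
                       interpolation=cv2.INTER_CUBIC)
    return cv2.resize(small, (w, h), interpolation=cv2.INTER_CUBIC)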

SAEHD: 
default ae_dims for df now 256. It is safe to train SAEHD on 256 ae_dims and higher resolution.
Example of recent fake: https://youtu.be/_lxOGLj-MC8

added Quick96 model.
This is the fastest model for low-end 2GB+ NVidia and 4GB+ AMD cards. 
Model has zero options and trains a 96pix fullface.
It is good for quick deepfake demo.
Example of the preview trained in 15 minutes on RTX2080Ti:
https://i.imgur.com/oRMvZFP.jpg

== 27.10.2019 ==

Extractor: fix for AMD cards

== 26.10.2019 ==

the red face-alignment square now contains an arrow that shows the up direction of the image

fix alignment of side faces
Before https://i.imgur.com/pEoZ6Mu.mp4
after https://i.imgur.com/wO2Guo7.mp4

fix message when no training data provided

== 23.10.2019 ==

enhanced sort by final: now faces are evenly distributed not only in the direction of yaw, 
but also in pitch

added 'sort by vggface': sorting by face similarity using VGGFace model. 
Requires 4GB+ VRAM and internet connection for the first run.


== 19.10.2019 ==

fix extractor bug for 11GB+ cards

== 15.10.2019 ==

removed fix "fixed bug when the same face could be detected twice"

SAE/SAEHD:
removed option 'apply random ct'

added option 
   Color transfer mode apply to src faceset. ( none/rct/lct/mkl/idt, ?:help skip: none )
   Changes the color distribution of src samples to be closer to the dst samples. Try all modes to find the best.
Previously lct mode was used, but sometimes it does not work properly for some facesets.


== 14.10.2019 ==

fixed bug when the same face could be detected twice

Extractor now produces a less shaky face, but the second pass is now 25% slower
before/after: https://imgur.com/L77puLH

SAE, SAEHD: 'random flip' and 'learn mask' options now can be overridden.
It is recommended to always start training with 'learn_mask' enabled for the first 20k iterations.

SAEHD: added option Enable random warp of samples, default is on
Random warp is required to generalize facial expressions of both faces. 
When the face is trained enough, you can disable it to get extra sharpness in fewer iterations.
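
Random warp, in essence, perturbs a coarse grid of control points and remaps the sample through it, so the network must generalize expressions rather than memorize pixels. A hedged OpenCV sketch (grid size and jitter strength are illustrative, not DFL's sample processor):

import cv2
import numpy as np

def random_warp(img, grid=5, strength=0.05):
    # Warp a training sample through a randomly jittered coarse grid.
    h, w = img.shape[:2]
    xs = np.linspace(0, w, grid, dtype=np.float32)
    ys = np.linspace(0, h, grid, dtype=np.float32)
    gx, gy = np.meshgrid(xs, ys)
    gx += np.random.uniform(-strength * w, strength * w, gx.shape)
    gy += np.random.uniform(-strength * h, strength * h, gy.shape)
    # Upsample the jittered grid to a dense per-pixel remap
    map_x = cv2.resize(gx, (w, h)).astype(np.float32)
    map_y = cv2.resize(gy, (w, h)).astype(np.float32)
    return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)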

== 10.10.2019 ==

fixed wrong NVIDIA GPU detection in extraction and training processes

increased speed of S3FD 1st pass extraction for GPU with >= 11GB vram.

== 09.10.2019 ==

fixed wrong NVIDIA GPU indexes in a systems with two or more GPU
fixed wrong NVIDIA GPU detection on the laptops

removed TrueFace model.

added SAEHD model ( High Definition Styled AutoEncoder )
Compare with SAE: https://i.imgur.com/3QJAHj7.jpg
This is a new heavyweight model for high-end cards to achieve maximum possible deepfake quality in 2020.

Differences from SAE:
+ new encoder produces more stable face and less scale jitter
+ new decoder produces subpixel clear result
+ pixel loss and dssim loss are merged together to achieve both training speed and pixel trueness (a loss sketch follows this list)
+ by default networks will be initialized with CA weights, but only after first successful iteration
  therefore you can test network size and batch size before weights initialization process
+ new neural network optimizer consumes less VRAM than before
+ added option <Enable 'true face' training>
  The result face will be more like src and will get extra sharpness.
  Enable it for last 30k iterations before conversion.
+ encoder and decoder dims are merged to one parameter encoder/decoder dims
+ added mid-full face, which covers 30% more area than half face.  
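
A hedged sketch of what a merged DSSIM + pixel loss can look like (weights and the use of tf.image.ssim are illustrative; SAEHD's real loss is implemented differently in DFL's own TF code):

import tensorflow as tf

def merged_face_loss(target, predicted):
    # Images are float tensors in [0, 1], shape (batch, H, W, 3).
    dssim = (1.0 - tf.image.ssim(target, predicted, max_val=1.0)) / 2.0
    pixel = tf.reduce_mean(tf.square(target - predicted), axis=[1, 2, 3])
    return tf.reduce_mean(dssim + pixel)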

example of the preview trained on RTX2080TI, 128 resolution, 512-21 dims, 8 batch size, 700ms per iteration:
without trueface            : https://i.imgur.com/MPPKWil.jpg
with trueface    +23k iters : https://i.imgur.com/dV5Ofo9.jpg

== 24.09.2019 ==

fix TrueFace model, required retraining

== 21.09.2019 ==

fix avatar model

== 19.09.2019 ==

SAE : WARNING, RETRAIN IS REQUIRED ! 
fixed model sizes from previous update. 
worked around a bug in the ML framework (Keras) that forced the model to train on random noise.

Converter: added blur on the same keys as sharpness

Added new model 'TrueFace'. Only for NVIDIA cards.
This is a GAN model ported from https://github.com/NVlabs/FUNIT
Model produces near zero morphing and high detail face.
Model has higher failure rate than other models.
It does not learn the mask, so fan-x mask modes should be used in the converter.
Keep src and dst faceset in same lighting conditions. 

== 13.09.2019 ==

Converter: added new color transfer modes: mkl, mkl-m, idt, idt-m

SAE: removed multiscale decoder, because it's not effective

== 07.09.2019 ==

Extractor: fixed bug with grayscale images.

Converter:

Session is now saved to the model folder.

blur and erode ranges are increased to -400+400

hist-match-bw is now replaced with seamless2 mode.

Added 'ebs' color transfer mode (works only on Windows).

FANSEG model (used in FAN-x mask modes) is retrained with new model configuration
and now produces better precision and less jitter

== 30.08.2019 ==

interactive converter now saves the session.
if input frames are changed (amount or filenames)
then interactive converter automatically starts a new session.
if the model has been trained further, all frames will be recomputed again with their saved configs.

== 28.08.2019 ==

removed the lip landmarks from face alignment;
the result is less scale jittering
before  https://i.imgur.com/gJaW5Y4.gifv 
after   https://i.imgur.com/Vq7gvhY.gifv

converter: fixed merged\ filenames, now they are 100% the same as the input from data_dst\

converted to X.bat : now properly accepts any filenames from the merged\ dir as input

== 27.08.2019 ==

fixed converter navigation logic and output filenames in merge folder

added EbSynth program. It is located in _internal\EbSynth\ folder
Start it via 10) EbSynth.bat
It starts with sample project loaded from _internal\EbSynth\SampleProject
EbSynth is mainly used to create painted video, but with EbSynth you can fix some weird frames produced by deepfake process.
before: https://i.imgur.com/9xnLAL4.gifv 
after:  https://i.imgur.com/f0Lbiwf.gifv
official tutorial for EbSynth : https://www.youtube.com/watch?v=0RLtHuu5jV4

== 26.08.2019 ==

updated pdf manuals for AVATAR model.

Avatar converter: added super resolution option.

All converters:
fixes and optimizations
super resolution DCSCN network is now replaced by RankSRGAN
added new option sharpen_mode and sharpen_amount

== 25.08.2019 ==

Converter: FAN-dst mask mode now works for half face models.

AVATAR Model: default avatar_type option on first startup is now HEAD. 
Head produces much more stable result than source.

updated usage of AVATAR model:
Usage:
1) place data_src.mp4 10-20min square resolution video of news reporter sitting at the table with static background,
   other faces should not appear in frames.
2) process "extract images from video data_src.bat" with FULL fps
3) place data_dst.mp4 square resolution video of face who will control the src face
4) process "extract images from video data_dst FULL FPS.bat"
5) process "data_src mark faces S3FD best GPU.bat"
6) process "data_dst extract unaligned faces S3FD best GPU.bat"
7) train AVATAR.bat stage 1, tune batch size to maximum for your card (32 for 6GB), train to 50k+ iters.
8) train AVATAR.bat stage 2, tune batch size to maximum for your card (4 for 6GB), train to decent sharpness.
9) convert AVATAR.bat
10) converted to mp4.bat

== 24.08.2019 ==

Added interactive converter.
With interactive converter you can change any parameter of any frame and see the result in real time.

Converter: added motion_blur_power param. 
Motion blur is applied using precomputed motion vectors, 
so the moving face will look more realistic.
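
Motion blur along a precomputed motion vector can be approximated with a directional line kernel; a hedged OpenCV sketch (kernel construction simplified, not the converter's actual code):

import cv2
import numpy as np

def motion_blur(img, angle_deg, power=9):
    # Blur an image along one direction, approximating motion blur.
    # `angle_deg` would come from a precomputed motion vector;
    # `power` is the kernel length in pixels.
    k = np.zeros((power, power), dtype=np.float32)
    k[power // 2, :] = 1.0                          # horizontal line kernel
    rot = cv2.getRotationMatrix2D((power / 2 - 0.5, power / 2 - 0.5), angle_deg, 1.0)
    k = cv2.warpAffine(k, rot, (power, power))
    k /= k.sum()
    return cv2.filter2D(img, -1, k)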

RecycleGAN model is removed.

Added experimental AVATAR model. Minimum required VRAM is 6GB for NVIDIA and 12GB for AMD.


== 16.08.2019 ==

fixed error "Failed to get convolution algorithm" on some systems
fixed error "dll load failed" on some systems

model summary is now better formatted

Expanded the eyebrow line of the face masks. It does not affect the mask of the FAN-x converter mode.
ConverterMasked: added a mask gradient to the bottom area, same as the side gradient

== 23.07.2019 ==

OpenCL : update versions of internal libraries

== 20.06.2019 ==

Trainer: added option for all models
Enable autobackup? (y/n ?:help skip:%s) : 
Autobackup model files with preview every hour for last 15 hours. Latest backup located in model/<>_autobackups/01

SAE: added option only for CUDA builds:
Enable gradient clipping? (y/n, ?:help skip:%s) : 
Gradient clipping reduces chance of model collapse, sacrificing speed of training.

== 02.06.2019 ==

fix error on typing uppercase values

== 24.05.2019 ==

OpenCL : fix FAN-x converter

== 20.05.2019 ==

OpenCL : fixed a bug where op analysis was repeated after each save of the model

== 10.05.2019 ==

fixed work of model pretraining

== 08.05.2019 ==

SAE: added new option 
Apply random color transfer to src faceset? (y/n, ?:help skip:%s) : 
Increases the variety of src samples by applying LCT color transfer from random dst samples.
It is like 'face_style' learning, but with a more precise color transfer and without the risk of model collapse; 
it also does not require additional GPU resources, but training time may be longer because the src faceset becomes more diverse.
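
A simplified sketch of the idea: pick a random dst sample and linearly shift the src sample's colour statistics toward it. A per-channel mean/std match stands in for the real LCT mode here; everything below is illustrative only.

import numpy as np

def random_color_transfer(src_img, dst_samples):
    # Shift src colours toward a randomly chosen dst sample.
    # Images are float arrays in [0, 1].
    dst_img = dst_samples[np.random.randint(len(dst_samples))]
    src = src_img.astype(np.float32)
    dst = dst_img.astype(np.float32)
    out = (src - src.mean(axis=(0, 1))) / (src.std(axis=(0, 1)) + 1e-6)
    out = out * dst.std(axis=(0, 1)) + dst.mean(axis=(0, 1))
    return np.clip(out, 0.0, 1.0)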

== 05.05.2019 ==

OpenCL: SAE model now works properly

== 05.03.2019 ==

fixes

SAE: additional info in help for options:

Use pixel loss - Enabling this option too early increases the chance of model collapse.
Face style power - Enabling this option increases the chance of model collapse.
Background style power - Enabling this option increases the chance of model collapse.


== 05.01.2019 == 

SAE: added option 'Pretrain the model?'

Pretrains the model with a large number of varied faces. 
This technique may help to train a fake where the src/dst data have very different face shapes and lighting conditions. 
The face will look somewhat morphed. To reduce the morph effect, 
some model files will be initialized but not updated after pretraining: LIAE: inter_AB.h5, DF: encoder.h5. 
The longer you pretrain the model, the more morphed the face will look. After pretraining, save and run the training again.


== 04.28.2019 ==

fix 3rd pass extractor hang on AMD 8+ core processors

Converter: fixed error with degrade color after applying 'lct' color transfer

added option at first run for all models: Choose image for the preview history? (y/n skip:n)
Controls: [p] - next, [enter] - confirm.

fixed error with option sort by yaw. Remember, do not use sort by yaw if the dst face has hair that covers the jaw.

== 04.24.2019 ==

SAE: finally the collapses were fixed

added option 'Use CA weights? (y/n, ?:help skip: %s ) : 
Initialize network with 'Convolution Aware' weights from paper https://arxiv.org/abs/1702.06295.
This may help to achieve a higher accuracy model, but consumes time at the first run.

== 04.23.2019 ==

SAE: training should be restarted
removed option 'Remove gray border' because it makes the model very resource-intensive.

== 04.21.2019 ==

SAE: 
fix multiscale decoder.
training with liae archi should be restarted

changed help for 'sort by yaw' option:
NN will not learn src face directions that don't match dst face directions. Do not use if the dst face has hair that covers the jaw.


== 04.20.2019 ==

fixed work with NVIDIA cards in TCC mode

Converter: improved FAN-x masking mode.
Now it excludes face obstructions such as hair, fingers, glasses, microphones, etc.
example https://i.imgur.com/x4qroPp.gifv
It works only for full face models, because there were glitches in half face version.

Fanseg is trained on >3000 varied faces with obstructions, manually refined using MaskEditor.
The accuracy of fanseg on complex obstructions can be improved by adding more samples to the dataset, but I have no time for that :(
The dataset is located in the official mega.nz folder.
If your fake has some complex obstructions that are incorrectly recognized by fanseg,
you can add manually masked samples from your fake to the dataset
and retrain it by using the --model DEV_FANSEG argument in the bat file. Read more info in the dataset archive.
Minimum recommended VRAM is 6GB and batch size 24 to train fanseg.
The resulting model\FANSeg_256_full_face.h5 should be placed in the DeepFacelab\facelib\ folder

Google Colab now works on Tesla T4 16GB.
With Google Colaboratory you can freely train your model for 12 hours per session, then reset session and continue with last save.
more info how to work with Colab: https://github.com/chervonij/DFL-Colab

== 04.07.2019 == 

Extractor: added warning if aligned folder contains files that will be deleted.

Converter subprocesses limited to maximum 6

== 04.06.2019 ==

added experimental mask editor. 
It was created to improve the FANSeg model, but you can try to use it for fakes.
Remember: it does not guarantee a quality improvement.
usage:
run 5.4) data_dst mask editor.bat
edit the mask of dst faces with obstructions
train SAE either with 'learn mask' or with 'style values'
Screenshot of mask editor: https://i.imgur.com/SaVpxVn.jpg
result of training and merging using edited mask: https://i.imgur.com/QJi9Myd.jpg
Complex masks are harder to train.

SAE: 
previous SAE model will not work with this update.
Greatly decreased chance of model collapse. 
Increased model accuracy.
Residual blocks now default and this option has been removed.
Improved 'learn mask'.
Added masked preview (switch by space key)

Converter: 
fixed rct/lct in seamless mode
added mask mode (6) learned*FAN-prd*FAN-dst

changed help message for pixel loss:
Pixel loss may help to enhance fine details and stabilize face color. Use it only if quality does not improve over time.

fixed ctrl-c exit in no-preview mode

== 03.31.2019 ==

Converter: fix blur region of seamless.

== 03.30.2019 == 

fixed seamless face jitter
removed options Suppress seamless jitter, seamless erode mask modifier.
seamlessed face now properly uses blur modifier
added option 'FAN-prd&dst' - using multiplied FAN prd and dst mask,

== 03.29.2019 ==

Converter: refactorings and optimizations
added new option
Apply super resolution? (y/n skip:n) : Enhance details by applying DCSCN network.
before/after gif - https://i.imgur.com/jJA71Vy.gif

== 03.26.2019 ==

SAE: removed lightweight encoder.
optimizer mode can now be overridden each run

Trainer: the loss line now shows the average loss values after saving

Converter: fixed bug with copying files without faces.

XNViewMP : updated version

fixed cut video.bat for paths with spaces

== 03.24.2019 ==

old SAE model will not work with this update.

Fixed a bug where SAE could collapse over time. 

SAE: removed CA weights and encoder/decoder dims.

added new options:

Encoder dims per channel (21-85 ?:help skip:%d) 
More encoder dims help to recognize more facial features, but require more VRAM. You can fine-tune model size to fit your GPU.

Decoder dims per channel (11-85 ?:help skip:%d) 
More decoder dims help to get better details, but require more VRAM. You can fine-tune model size to fit your GPU.

Add residual blocks to decoder? (y/n, ?:help skip:n) : 
These blocks help to get better details, but require more computing time.

Remove gray border? (y/n, ?:help skip:n) : 
Removes gray border of predicted face, but requires more computing resources.


Extract images from video: added option
Output image format? ( jpg png ?:help skip:png ) : 
PNG is lossless, but produces images roughly 10x larger than JPG.
JPG extraction is faster, especially on an HDD rather than an SSD.

== 03.21.2019 ==

OpenCL build: fixed, now works on most video cards again.

old SAE model will not work with this update.
Fixed a bug where SAE could collapse over time

Added option
Use CA weights? (y/n, ?:help skip: n ) :
Initialize network with 'Convolution Aware' weights. 
This may help to achieve a higher accuracy model, but consumes time at first run.

Extractor:
removed DLIB extractor
greatly increased accuracy of landmark extraction, especially with the S3FD detector, but the 2nd pass is now slower.
From this point on, it is recommended to use only the S3FD detector.
before https://i.imgur.com/SPGeJCm.gif
after https://i.imgur.com/VmmAm8p.gif

Converter: added new option to choose type of mask for full-face models.

Mask mode: (1) learned, (2) dst, (3) FAN-prd, (4) FAN-dst (?) help. Default - 1 : 
Learned - learned mask, if you chose the 'Learn mask' option in the model. The contours are fairly smooth, but can be wobbly.
Dst - raw mask from the dst face, wobbly contours.
FAN-prd - mask from the pretrained FAN model applied to the predicted face. Very smooth, non-shaky contours.
FAN-dst - mask from the pretrained FAN model applied to the dst face. Very smooth, non-shaky contours.
Advantage of FAN masks: you get a non-wobbly, non-shaky mask without the model having to learn it.
Disadvantage of FAN masks: they may produce artifacts on the contours if the face is obstructed.

== 03.13.2019 ==

SAE: added new option

Optimizer mode? ( 1,2,3 ?:help skip:1) : 
this option is only for NVIDIA cards. Optimizer mode of the neural network.
1 - default.
2 - allows you to train a 2x bigger network, uses a lot of RAM.
3 - allows you to train a 3x bigger network, uses a huge amount of RAM and is 30% slower.

The term 'epoch' has been renamed to 'iteration'.

added a timestamp to the training status line in the console

== 03.11.2019 ==

CUDA10.1AVX users - update your video drivers from geforce.com site

face extractor:

added new extractor S3FD - more precise, produces fewer false-positive faces, accelerated by AMD/IntelHD GPU (while MT is not)

speed of 1st pass with DLIB significantly increased

decreased the number of false-positive faces for all extractors

manual extractor: added 'h' button to hide the help information

fix DFL conflict with system python installation

removed unwanted tensorflow info from console log

updated manual_ru

== 03.07.2019 ==

fixes

upgrade to python 3.6.8

Reorganized structure of DFL folder. Removed unnecessary files and other trash.

Current available builds now:

DeepFaceLabCUDA9.2SSE - for NVIDIA cards up to GTX10x0 series and any 64-bit CPU
DeepFaceLabCUDA10.1AVX - for NVIDIA cards up to RTX and CPU with AVX instructions support
DeepFaceLabOpenCLSSE - for AMD/IntelHD cards and any 64-bit CPU

== 03.04.2019 == 

added
4.2.other) data_src util recover original filename.bat
5.3.other) data_dst util recover original filename.bat

== 03.03.2019 ==

Converter: fix seamless

== for older changelog see github page ==

NOTE: This thread is meant to be just a guide.


The only types of posts/questions allowed here are ones about the guide itself (suggestions on what to add to it) and also about bugs.

A workflow-style tutorial is coming. If you have any suggestions about the current state of the guide, want me to explain some features in more detail, or have found a bug (which wasn't already reported/fixed or isn't listed at the end of the guide), you can post it in this thread.

For anything else like questions about techniques/workflows, suggestions about features in DFL 2.0, complaints or just general talk about the process of making deepfakes in DFL 2.0 please post in this thread: You are not allowed to view links. Register or Login to view.
Also remember to check out the FAQ, which has a collection of the most frequently asked questions, tips, and many other useful things.
If I helped you in any way or you enjoy my deepfakes, please consider a small donation via bitcoin, tokens or paypal/patreon.
Paypal/Patreon: You are not allowed to view links. Register or Login to view.
Bitcoin: 1C3dq9zF2DhXKeu969EYmP9UTvHobKKNKF
Want to request a paid deepfake or have any questions regarding the forums or deepfake creation using DeepFaceLab? Write me a message.
TMB-DF on the main website - You are not allowed to view links. Register or Login to view.
#3
My personal workflow in DFL 2.0

1. The first thing I do is prepare my materials for the source and destination datasets, which includes:


- deciding on which celebrity I want to fake
- finding good quality materials that cover wide range of face angles and expressions
- finding a destination video with a matching/similar looking person, using saved clips, forum threads, mentions/articles on other sites, or sites that use AI to find similar looking people.

After I do all of that I combine all sources (videos) into one clip, edit it to get rid of unnecessary parts, and export it from Vegas as JPGs. That way I don't need to render a new data_src.mp4 or extract each video as a separate data_src, which saves time and makes cleanup faster later (because I cut out or mask out the faces of other people with simple masks during editing).

While that's rendering I'm already downloading the destination clip I'll be using, and once the source JPGs are done I import the clip to be used for data_dst into Vegas to edit it to the desired length and choose the right parts for faking.

2. After that's done I start extraction of frames, then extraction of faces from the prepared data_dst.mp4, and then extraction of faces from the source JPGs.

Then comes the cleaning process, where I go through all the faces within the "data_src/aligned" and "data_dst/aligned" folders.
For data_src I just use sort by histogram to group similar faces and then remove the bad ones by browsing the "data_src/aligned" folder with either Windows Explorer or 4.1) data_src view aligned result.
The same is then done with data_dst: first I sort by histogram, and once that's done I need to find all non-extracted faces. I recently started doing this in a different way that is a bit faster than my previous method and guarantees extraction of 100% of the faces.

- I basically go through the "data_dst/aligned" folder and remove all faces that are wrong, which includes incorrectly aligned faces, false positives and other people.
- Then I revert the order/names to the originals and use my PowerShell command to rename all files with the _0 suffix so they no longer have it (a Python equivalent is sketched after this list).
To open a PowerShell command window, go into the folder with the files you want to rename (data_dst/aligned) and, while holding Shift, right click; you'll see an option to open a PowerShell command window.
The command is get-childitem *.jpg | foreach {rename-item $_ $_.name.replace("_0","")}
- While it's processing (leave the command window open and close it only once the folder address gets displayed a second time under the command you typed, or you see that the first and last face no longer have the _0 suffix), copy the aligned_debug folder. Once all files are renamed, check for _1/_2 files; if there are any, remove them, making sure that the file that would have the _0 suffix is also correctly aligned - if it's not, remove the base file and rename the file with _ to one without it.
- After that I copy all faces into the new copy of the aligned_debug folder, replacing all files. Once copying is done I remove them, so all that's left are frames for which faces were not detected. I then go through it and remove all frames that obviously have no faces and leave all that do (even if the face is only partially visible - after all, we want to extract all faces, even if obscured or partially out of frame).
- Then you copy all the files that are left into the original aligned_debug folder, rename them, and once that's done you delete all the files. That way you can use 5) data_dst extract full_face MANUAL RE-EXTRACT DELETED ALIGNED_DEBUG to extract all non-extracted faces. This is very useful if you are working on a long video and browsing 5.1) data_dst view aligned_debug results in XnViewMP is laggy due to the software using up all your RAM to load thumbnails for 20 thousand or so frames - this way you don't even need to use it.
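
If PowerShell isn't convenient, the same _0-suffix cleanup can be done with a few lines of Python run from inside data_dst/aligned; this is purely an alternative to the one-liner above and mirrors its behaviour:

import glob
import os

# Rename e.g. 00001_0.jpg -> 00001.jpg, same as the PowerShell one-liner above.
for path in glob.glob("*.jpg"):
    new_path = path.replace("_0", "")
    if new_path != path and not os.path.exists(new_path):
        os.rename(path, new_path)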

3. Now that both datasets are clean I start training.

I use a 192 model that was pretrained on random faces prior to proper training. I use models between 128 and 192 resolution, default dims, DF architecture, on a GTX 1060 6GB.
I start training with all options disabled except random warp, which should be enabled for a while to generalize the face and make it look more like src (that could not happen if you started training with random warp disabled). It doesn't need to be on for long - just long enough for faces to start appearing fairly clearly. After a few thousand iterations I disable random warp and carry on training with it disabled. Gradient clipping is enabled all the time to prevent model collapse.

After a while of training with random warp off I turn on true face at a value between 0.001 and 0.15 to get the face to look more like src. After some more training I enable GAN, but not always; sometimes I don't enable true face and instead train with GAN even while random warp is still enabled. I use a value between 0.05 and 2 for GAN. At the end, when I feel the faces look good enough, I enable lr_dropout and eyes priority to get the last bit of detail and ensure the eyes look the best they can. If there is some noise I disable GAN and carry on; if parts of hair/objects from the source start appearing I disable TrueFace; if results look bad after some changes I disable everything and enable random warp to fix the mistakes - or, what's actually wiser, I make backups before enabling GAN and TrueFace.

I pretty much don't use style power, but if you want to transfer some color data from dst to src (such as makeup or skin color) you can enable face style power. As for color transfer, I try not to use it either, because it can cause some issues with colors; if anything I try to match my source to have even colors/brightness throughout and be fairly close to a naturally lit skin tone. But if dst has a drastically different tone (such as being shot at night or around colored lights), color transfer during training is a must. In those cases I usually stick with RCT, but if that doesn't work I try other modes; just note that sot-m is a bit heavy on performance.

4. Last step is merging and converting video back into mp4 file.

For merging I use the interactive merger/converter: overlay mode, erode between -10 and 10, mask blur anywhere from 100 to 200. I don't use motion blur, and face scale is usually at 0, though in certain cases I adjust it accordingly. Color transfer I usually leave at RCT, but in some cases I had better results with MKL-M, SOT-M or MIX-M; I never use IDT/IDT-M (looks bad), LCT usually gives the face too much contrast, and MKL can give it an oddly saturated coloring. I used to use a small amount of box sharpening (1-5) with RankSRGAN, but now with higher resolutions and the new FaceEnhancer upscaling algorithm I don't use it. FaceEnhancer I usually use at between 30 and 50. For mask mode I usually stick with FAN-dst to handle obstructions, but if the masks are wrong I use Vegas (or any other software that can handle masking/compositing, like After Effects) to manually fix some bad frames; I use BorisFX (same as in AE) in Vegas as a plugin to handle masks.
After all that's done I may sometimes do some additional post-processing, such as adding grain to match the face better with the rest of the video, sharpening, and color correction.
After merging is done I use the built-in features to render the merged frames into a result.mp4 file, on which I perform the mentioned post-processing in Vegas Pro.
#4
FAQ: You are not allowed to view links. Register or Login to view.
