I'll add the avatar workflow to the guide, but in short: src is a longer video of the person we want to alter, and dst is the actor performing the facial expressions we want to transfer onto src. So it's essentially the reverse of a normal face swap, where src is the face we want to swap onto dst.
SRC should be longer, around 10-20 minutes of footage.
DST is the actor video, so it should be as long as the final video is supposed to be.
Then you extract the frames to PNG (JPG doesn't work, I don't know why). Be prepared for huge file sizes, in excess of 10-20GB per dst/src dataset.
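To get a rough idea of how much disk space the PNG frames will need, here's a quick back-of-the-envelope sketch in Python. The frame rate and the average size per PNG are assumptions of mine, not settings of the tool, and they depend heavily on your video's resolution and content, so plug in your own numbers:

```python
def estimate_png_dataset_gb(minutes, fps=30.0, avg_png_mb=1.0):
    """Rough disk usage of a PNG frame dump, in GB (1 GB = 1024 MB here).

    fps and avg_png_mb are assumed example values; measure a few of
    your own extracted frames to get a realistic average.
    """
    frames = int(minutes * 60 * fps)
    return frames * avg_png_mb / 1024


# 10 minutes of src footage at an assumed 30 fps and ~1 MB per frame
print(round(estimate_png_dataset_gb(10), 1))  # prints 17.6
```

With those assumed numbers a 10 minute clip already lands in the 10-20GB range, so check your free disk space before extracting.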
After extracting frames you align them, for src you use "4) data_src mark faces S3FD best GPU" and for dst you use "5) data_dst extract unaligned faces S3FD best GPU".
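Since JPG frames don't work, it may be worth checking that a frame folder really contains only PNGs before you spend days training. This is just a small helper of my own, not part of the tool, and the folder path you pass in is whatever your own setup uses:

```python
from pathlib import Path


def non_png_files(frame_dir):
    """Return the sorted names of files in frame_dir that are not .png.

    Subfolders are ignored; the extension check is case-insensitive.
    """
    return sorted(
        p.name
        for p in Path(frame_dir).iterdir()
        if p.is_file() and p.suffix.lower() != ".png"
    )
```

Call it on each frame folder, for example `non_png_files("workspace/data_src")` (a hypothetical path). An empty list means every frame in that folder is a PNG.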
Then you run avatar training, starting with stage 1. The max batch size for a 6-8GB GPU is going to be between 32 and 48; for stage 2 it's going to be 6-8.
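If you're not sure where your card's limit is, one way to approach it is to start at the top of the range and step down until training starts without running out of memory. A minimal sketch of that idea: the ranges are the ones quoted above for 6-8GB cards, while the stepping scheme and the function name are my own assumptions, not anything the trainer does for you:

```python
# Max batch size ranges quoted above for a 6-8GB GPU, per training stage.
BATCH_RANGE_6_8GB = {1: (32, 48), 2: (6, 8)}


def batch_sizes_to_try(stage):
    """Return candidate batch sizes from the top of the range down to the bottom."""
    low, high = BATCH_RANGE_6_8GB[stage]
    # Assumed step: roughly a quarter of the range, but at least 1.
    step = max(1, (high - low) // 4)
    return list(range(high, low - 1, -step))


print(batch_sizes_to_try(1))  # prints [48, 44, 40, 36, 32]
print(batch_sizes_to_try(2))  # prints [8, 7, 6]
```

You'd still enter each value by hand when the trainer asks for the batch size; this only lists sensible values to try in order.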
Training each stage may take 2-3 days; it's definitely slower than swapping. Models probably cannot be reused, but I'm not sure about that. As for source/head/face, I have no idea what these do, so more testing needs to be done.
Then you convert as usual with convert avatar.