MrDeepFakes Forums

fakerdaker
(LightFaker)

Registration Date: 06-10-2019
Date of Birth: Not Specified
Local Time: 09-23-2019 at 03:15 PM
Status:

fakerdaker's Most Liked Post
Post Subject: RE: What is this new feature called gradient clipping?
Number of Likes: 4
Thread Subject: What is this new feature called gradient clipping?
Forum Name: Questions
Post Message
I just recently started reading about machine learning, so maybe I can explain it a bit.

Basically, what a neural network does is take some inputs (src images in this case) and create a function that will produce a desired output (dst img). It makes this conversion function more and more precise the longer it's trained. But it also needs some way to know how close the output it generates is to the desired output. Essentially, it compares the output it produced to the desired output via what's called a cost function.

Training then tries to minimize the loss that this cost function measures. To do that it uses something called a gradient, which you'd recognize if you've taken multi-variable calculus. If not: the gradient tells you in what direction the function's rate of change is steepest for a given input. The simple explanation is, it tells you how to change the inputs to move toward a relative minimum of the loss.
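
Here's a tiny numpy toy I wrote to convince myself of this (a made-up example with my own names, nothing from the actual DFL code):

Code:
import numpy as np

# Toy cost function: loss = x^2 + y^2, a bowl whose minimum is at (0, 0).
def loss(params):
    x, y = params
    return x**2 + y**2

def gradient(params):
    # Analytic gradient of x^2 + y^2 is (2x, 2y): the direction of
    # steepest *increase*, so we step the opposite way to lower the loss.
    x, y = params
    return np.array([2 * x, 2 * y])

params = np.array([3.0, -2.0])           # starting guess
lr = 0.1                                 # step size (learning rate)
params = params - lr * gradient(params)  # one gradient descent step
print(params, loss(params))              # closer to (0, 0), loss went down
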
[Image: 3D loss surface, descent converging to the minimum] [Image: 3D loss surface, descent overshooting]
In these images, imagine x and y are pieces of information about the image that will be modified (in reality there will be far more than just two), and z is the loss created by changing those pieces of information.

The image on the left is the ideal result: we adjust our inputs (x, y) until the resulting z (loss) is at the lowest point (a relative minimum). The gradient tells us which way our inputs need to be changed.
The image on the right shows what happens when we follow the gradient too far at each step. If we make our steps too big, we keep overshooting our target. But if our steps are very small, it takes a long time to converge.
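
You can actually watch the overshooting happen with the same toy bowl, just by making the step size too big (again, my own made-up example):

Code:
import numpy as np

def gradient(params):
    return 2 * params  # gradient of x^2 + y^2 again

for lr in (0.1, 1.1):  # a sensible step size vs. one that is too big
    params = np.array([3.0, -2.0])
    for _ in range(20):
        params = params - lr * gradient(params)
    # lr=0.1 lands near the minimum (0, 0); lr=1.1 jumps past it by more
    # every step, so the loss grows instead of shrinking.
    print(lr, params, np.sum(params**2))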

So I think what gradient clipping does is limit how far you follow that path at each step, not allowing your steps to get too big. I'm guessing the model corruption is a result of "gradient explosion", which is why iperov chose to limit the gradient by "clipping" it.
I'm really new to this topic too, but it sounds like the "steps" can get way too large and the loss blows up instead of settling at a minimum, which causes the corruption?
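
I haven't read how iperov actually implemented it, but clipping a gradient by its norm usually looks something like this (the function name here is mine, purely illustrative):

Code:
import numpy as np

def clip_by_norm(grad, max_norm=1.0):
    # If the gradient is longer than max_norm, rescale it so its length
    # is exactly max_norm. Direction is preserved; only the step shrinks.
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([30.0, -40.0])  # an "exploding" gradient with length 50
print(clip_by_norm(g))       # [0.6 -0.8], same direction, length 1.0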

This site explains it pretty well, though it might be difficult to understand if you have no background in calculus:
[link]

I should note that I don't know how this process works for the combined image; I believe this is what happens when the model is generating its own versions of the src and dst images. If anyone has a better understanding, please feel free to correct this, as it's my best understanding from self-learning.

fakerdaker's Forum Info
Joined:
06-10-2019
Last Visit:
(Hidden)
Total Posts:
17 (0.16 posts per day | 0.17 percent of total posts)
Total Threads:
1 (0.01 threads per day | 0.05 percent of total threads)
Time Spent Online:
(Hidden)
Thanks/Likes:
Given: 13 | Received: 12
Members Referred:
0