The Gradient Descent algorithm is used not only for minimizing the cost function of linear regression but also for minimizing many other functions.
Algorithm Outline Steps:
Step 1: Start with some Θ0, Θ1.
Step 2: Keep changing Θ0, Θ1 to reduce J(Θ0, Θ1) until the algorithm ends up at a minimum.
Algorithm:
Repeat until convergence
{
    Θj := Θj - α · ∂/∂Θj J(Θ0, Θ1)    (simultaneously for j = 0 and j = 1)
}
Here α is the learning rate.
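As a rough sketch, this is what one batch gradient descent loop for linear regression (hypothesis h(x) = Θ0 + Θ1·x, squared-error cost J) can look like in Python; the toy data, starting point, and fixed iteration count here are assumptions for illustration, not part of the original algorithm:

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    m = len(x)                    # number of training examples
    theta0, theta1 = 0.0, 0.0     # Step 1: start with some theta0, theta1
    for _ in range(num_iters):    # Step 2: keep changing them to reduce J
        # Partial derivatives of J(theta0, theta1), each summed over
        # ALL m training examples (this is what makes it "batch").
        grad0 = sum((theta0 + theta1 * x[i]) - y[i] for i in range(m)) / m
        grad1 = sum(((theta0 + theta1 * x[i]) - y[i]) * x[i] for i in range(m)) / m
        # Simultaneous update: both gradients are computed before
        # either parameter changes.
        theta0 = theta0 - alpha * grad0
        theta1 = theta1 - alpha * grad1
    return theta0, theta1

# Toy usage: points that roughly follow y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 8.0]
print(gradient_descent(x, y))   # theta1 should come out near 2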
How Θ0, Θ1 are updated is shown below.
The correct way to update simultaneously is:
    temp0 := Θ0 - α · ∂/∂Θ0 J(Θ0, Θ1)
    temp1 := Θ1 - α · ∂/∂Θ1 J(Θ0, Θ1)
    Θ0 := temp0
    Θ1 := temp1
The incorrect way (not simultaneous, because the already-updated Θ0 is used when computing Θ1's update) is:
    temp0 := Θ0 - α · ∂/∂Θ0 J(Θ0, Θ1)
    Θ0 := temp0
    temp1 := Θ1 - α · ∂/∂Θ1 J(Θ0, Θ1)
    Θ1 := temp1
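To make the difference concrete, here is a one-step comparison in Python on an assumed toy cost J(Θ0, Θ1) = (Θ0 + Θ1 - 2)², chosen only so the two orderings give visibly different results:

def dJ_dtheta0(t0, t1):
    return 2 * (t0 + t1 - 2)    # partial derivative w.r.t. theta0

def dJ_dtheta1(t0, t1):
    return 2 * (t0 + t1 - 2)    # partial derivative w.r.t. theta1

alpha = 0.1

# Correct: both gradients are evaluated at the OLD (theta0, theta1).
theta0, theta1 = 0.0, 0.0
temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
theta0, theta1 = temp0, temp1
print(theta0, theta1)           # 0.4 0.4

# Incorrect: theta0 is overwritten first, so theta1's gradient is
# evaluated at the NEW theta0 instead of the old one.
theta0, theta1 = 0.0, 0.0
theta0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
theta1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
print(theta0, theta1)           # 0.4 0.32 -- lands at a different point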
How Θ1 is updated when the slope is positive/negative:
If the slope (derivative) at Θ1 is positive, then Θ1 = Θ1 - α · (+ve number), so Θ1 decreases and moves left toward the minimum.
If the slope at Θ1 is negative, then Θ1 = Θ1 - α · (-ve number), so Θ1 increases and moves right toward the minimum.
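A quick numeric check of both cases, using an assumed toy cost J(Θ1) = Θ1², whose minimum is at Θ1 = 0 and whose slope is 2·Θ1:

alpha = 0.1

# Start to the RIGHT of the minimum: the slope 2 * theta1 is positive,
# so subtracting alpha * (+ve number) moves theta1 left, toward 0.
theta1 = 3.0
theta1 = theta1 - alpha * (2 * theta1)
print(theta1)    # 2.4 -- closer to the minimum

# Start to the LEFT of the minimum: the slope is negative,
# so subtracting alpha * (-ve number) moves theta1 right, toward 0.
theta1 = -3.0
theta1 = theta1 - alpha * (2 * theta1)
print(theta1)    # -2.4 -- also closer to the minimum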
The learning rate α controls how big a step gradient descent takes when minimizing the cost function.
Keep two things in mind when choosing the learning rate α:
1) If α is too small, gradient descent can be very slow.
2) If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge. Both behaviours are illustrated in the sketch below.
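Both effects can be seen on the same assumed toy cost J(Θ1) = Θ1² (slope 2·Θ1); the specific α values below are illustrative only:

def run(alpha, steps=10, theta1=1.0):
    # Repeatedly apply theta1 := theta1 - alpha * slope.
    for _ in range(steps):
        theta1 = theta1 - alpha * (2 * theta1)
    return theta1

print(run(alpha=0.001))   # too small: ~0.98, barely moved (very slow)
print(run(alpha=0.4))     # reasonable: ~1e-7, essentially at the minimum
print(run(alpha=1.1))     # too large: ~6.19, overshoots and diverges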
The Gradient Descent algorithm is also called the Batch Gradient Descent algorithm, because at each step it uses all of the training examples (note how the gradients in the first sketch above sum over all m examples).