Gradient Descent: Simply Explained With a Tutorial

In the previous post, Linear Regression, I gave a general overview of simple linear regression. Now it’s time to see how to train a simple linear regression model and how to find the line that best fits your data set.

Gradient descent is a technique for finding the point of minimum error (the sum of squared residuals). That point gives the coefficient (a) and intercept (b) of the best-fit line in the line equation y = ax + b.
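To make the idea concrete, here is a minimal sketch of gradient descent for the line y = ax + b. The toy data, the learning rate, the number of steps, and the function name `gradient_descent` are assumptions chosen for illustration; none of them come from the salary data set used in this post.

```python
def gradient_descent(xs, ys, learning_rate=0.01, n_steps=5000):
    """Fit y = a*x + b by gradient descent.

    The error being minimized is the mean of the squared residuals,
    which has the same minimum point as the sum of squared residuals.
    """
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Residual of each data point for the current line.
        residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
        # Partial derivatives of the error with respect to a and b.
        grad_a = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = -2.0 / n * sum(residuals)
        # Step downhill, against the gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = gradient_descent(xs, ys)
print(a, b)
```

On this toy data the loop should settle very close to a = 2 and b = 1. A learning rate that is too large makes the updates overshoot and diverge, so it usually needs some tuning for real data.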

Let’s reinvent the wheel and determine the coefficients of our linear regression model with a few lines of code. After that, we will compare our result with the output of scikit-learn’s linear regression model.

1- Finding coefficients of simple linear regression

First, we import a simple data set of engineers’ salaries versus their years of experience. Our linear model should predict the salary for any new data point, given an engineer’s years of experience.
The data set can be downloaded from the Kaggle link in the References section.
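As a rough sketch, the file could be read with Python’s standard `csv` module. The helper name `load_salary_data` and the column names `YearsExperience` and `Salary` are assumptions here; check them against the header of the file you download.

```python
import csv


def load_salary_data(csv_file):
    """Read (years of experience, salary) columns from an open CSV file."""
    reader = csv.DictReader(csv_file)
    years, salaries = [], []
    for row in reader:
        # Column names are assumed; adjust them to match the downloaded file.
        years.append(float(row["YearsExperience"]))
        salaries.append(float(row["Salary"]))
    return years, salaries


# Usage, assuming the file was saved as Salary_Data.csv next to the script:
# with open("Salary_Data.csv", newline="") as f:
#     years, salaries = load_salary_data(f)
```

Passing an open file object rather than a path keeps the helper easy to try out on a small in-memory sample before pointing it at the real file.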

Now let’s plot our data points. The plot shows a linear relationship between years of experience and salary.

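To search for the best slope, we try 100 trial lines with different slopes and compute the sum of squared residuals for each one. The 100 trial lines that pass through our data points are shown below.
We now have the sum of squared residuals for each line, so let’s plot it against the slopes.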
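The trial-line step could look roughly like the sketch below. The toy data, the fixed intercept, and the slope range are assumptions made for illustration; the list name `ss_residuals` follows the name used in this post.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical); replace with years of experience and salaries.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

n_trials = 100
intercept = 1.0  # assumed fixed, so that only the slope varies
# 100 evenly spaced candidate slopes between 0 and 4 (assumed range).
slopes = [4.0 * i / (n_trials - 1) for i in range(n_trials)]
# One error value per trial line, in the same order as the slopes.
ss_residuals = [sum_squared_residuals(xs, ys, m, intercept) for m in slopes]
```

Plotting `ss_residuals` against `slopes` then gives the error curve discussed next.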

In this graph, the errors decline gradually down to the global minimum and then rise again.

The point with the minimum sum of squared residuals gives the slope of the best-fit line. To find the minimum, we apply the following lines of code, which find the index of this point in the ss_residuals list and then look up the corresponding slope.
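A minimal sketch of that lookup, using short made-up lists as stand-ins for the real `slopes` and `ss_residuals` (the values below are hypothetical):

```python
# Hypothetical stand-ins for the lists built from the trial lines.
slopes = [0.0, 0.5, 1.0, 1.5, 2.0]
ss_residuals = [40.0, 22.5, 10.0, 2.5, 9.0]

# Position of the smallest error, then the slope stored at that position.
best_index = ss_residuals.index(min(ss_residuals))
best_slope = slopes[best_index]
print(best_index, best_slope)  # prints: 3 1.5
```

If the errors are stored in a NumPy array instead of a list, `numpy.argmin` returns the same index.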

The index of the minimum value in the ss_residuals list is 70.

Let’s plot our best-fit line with the data points to view the result.

Great, we have our best-fit line, but is it really the best? We should compare our reinvented wheel with the modern wheel :)

2- Apply scikit-learn Linear model

In scikit-learn it is much easier: we fit a model to our data set and read off the coefficients.
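For reference, a fit with scikit-learn’s `LinearRegression` could look like the sketch below. The toy arrays are assumptions standing in for the salary data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (hypothetical). X must be 2-D: one row per sample, one column per feature.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]          # coefficient a
intercept = model.intercept_    # intercept b
r_squared = model.score(X, y)   # coefficient of determination, R^2
print(slope, intercept, r_squared)
```

Because the toy points lie exactly on y = 2x + 1, the fitted slope and intercept should come out very close to 2 and 1, with a score close to 1.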

We can see that the coefficients from the two approaches are almost the same. Let’s plot both lines together to compare them.

The score of the model (its R² value) is 0.95, which is great for a best-fit line.
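For a `LinearRegression` model, scikit-learn’s `score` method returns the coefficient of determination, R². The sketch below computes the same quantity by hand on made-up numbers; the function name `r_squared` is just an illustrative choice.

```python
def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    # Error left over after using the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    # Error of the baseline that always predicts the mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


ys = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]     # predictions that match every point
mean_only = [2.5, 2.5, 2.5, 2.5]   # predictions that ignore x entirely
print(r_squared(ys, perfect))      # prints: 1.0
print(r_squared(ys, mean_only))    # prints: 0.0
```

A score of 1.0 means the line explains all of the variation in the data, while 0.0 means it does no better than always predicting the mean.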

Cool! The two lines still look slightly different, but the match can be improved by increasing the number of trial lines.

Thank you for reading!

You can reach me at: https://www.linkedin.com/in/bassemessam/

References:

https://www.kaggle.com/karthickveerakumar/salary-data-simple-linear-regression?select=Salary_Data.csv
