Linear Regression Algorithm | Linear Regression in Python | Machine Learning Algorithm | Edureka


Linear regression is one of the easiest algorithms in machine learning. It is a statistical model that attempts to show the relationship between two variables with a linear equation. Hello everyone. This is Atul from Edureka, and in today’s session we’ll learn about the
linear regression algorithm. But before we drill down
to the linear regression algorithm in depth, I’ll give you a quick overview of today’s agenda. We’ll start the session with a quick overview of what regression is, since linear regression is one type of regression algorithm. Once we have learned about regression, its use cases, and its various types, we’ll learn the algorithm from scratch: I’ll teach you its mathematical implementation first, and then we’ll drill down to the coding part and implement linear regression using Python. In today’s session we’ll deal with the linear regression algorithm using the least squares method and check its goodness of fit, that is, how close the data is to the fitted regression line, using the R-squared method. Finally, we’ll optimize it using the gradient descent method. In the last part, the coding session, I’ll teach you to implement linear regression using Python. The coding session is divided into two parts. The first part consists of linear regression using Python from scratch, where you will use the mathematical algorithm that you have learned in this session. In the second part we’ll use scikit-learn for a direct implementation of linear regression. All right, I hope the agenda is clear to you. So let’s begin our session
with what regression is. Regression analysis is a form of predictive modeling technique that investigates the relationship between a dependent and an independent variable. A regression analysis involves graphing a line over a set of data points that most closely fits the overall shape of the data. In other words, regression shows how changes in the dependent variable on the y-axis relate to changes in the explanatory variable on the x-axis.

Now you would ask, what are the uses of regression? There are three major uses. The first is determining the strength of predictors: regression can be used to identify the strength of the effect that the independent variables have on the dependent variable. For example, you can ask questions like: what is the strength of the relationship between sales and marketing spending, or what is the relationship between age and income? The second is forecasting an effect: regression can be used to forecast the effect or impact of changes. That is, regression analysis helps us understand how much the dependent variable changes with a change in one or more independent variables. For example, you can ask: how much additional sales income will I get for each thousand dollars spent on marketing? The third is trend forecasting: regression analysis can be used to predict trends and future values and to get point estimates. Here you can ask questions like: what will the price of Bitcoin be in the next six months?

So the next topic is linear
versus logistic regression. By now I hope you know what regression is, so let’s move on and understand its types. There are various kinds of regression, such as linear regression, logistic regression, polynomial regression, and others, but for this session we’ll focus on linear and logistic regression. Let me tell you what linear regression is and what logistic regression is, and then we’ll compare the two.

Starting with linear regression: in simple linear regression we are interested in things like y = mx + c. What we are trying to find is the correlation between the x and y variables; this means that every value of x has a corresponding value of y, and y is continuous. In logistic regression, however, we are not fitting our data to a straight line as in linear regression. Instead, we are mapping y versus x to a sigmoid function. In logistic regression what we find out is whether y is 1 or 0 for a particular value of x; that is, we are essentially deciding a true or false value for a given value of x.

So, as a core concept, in linear regression the data is modeled using a straight line, whereas in logistic regression the data is modeled using a sigmoid function. Linear regression is used with continuous variables; logistic regression is used with categorical variables. The output, or prediction, of a linear regression is the value of the variable; the output of a logistic regression is the probability of occurrence of the event. How do you check accuracy and goodness of fit? For linear regression there are various measures, such as loss, R-squared, adjusted R-squared, and so on, while for logistic regression you have accuracy, precision, recall, the F1 score (which is nothing but the harmonic mean of precision and recall), the ROC curve for determining the probability threshold for classification, the confusion matrix, and many more.

Summarizing the difference between linear and logistic regression, you can say that the type of function you are mapping to is the main point of difference: a linear regression maps a continuous x to a continuous y, whereas a logistic regression maps a continuous x to a binary y. So we can use logistic regression to make categorical or true/false decisions from the data. Let’s move on ahead. Next is linear
regression selection criteria, or you can say, when will you use linear regression? The first criterion is classification and regression capabilities. Regression models predict a continuous variable, such as the sales made on a day or the temperature of a city. Their reliance on a polynomial, like a straight line, to fit a data set poses a real challenge when it comes to building a classification capability. Imagine that you fit a line to the training points that you have, and then you add some more data points. In order to fit them, you have to change your existing model, maybe even the threshold itself. This will happen with each new data point you add to the model; hence linear regression is not good for classification models. Next is data quality. Each missing value removes one data point that could optimize the regression, and in simple linear regression outliers can significantly disrupt the outcome. For now, just know that if you remove the outliers, your model will usually fit better. Next is computational complexity. Linear regression is often not computationally expensive compared to decision trees or clustering algorithms. The order of complexity for n training examples and x features usually falls in either O(x²) or O(xn). Next is comprehensibility and transparency. Linear regression models are easily comprehensible and transparent in nature. They can be represented by simple mathematical notation and can be understood very easily. So these are some of the criteria based on which you will select the linear regression algorithm.

All right. Next is where linear regression is
used. First is evaluating trends and sales estimates. Linear regression can be used in business to evaluate trends and make estimates or forecasts. For example, if a company’s sales have increased steadily every month for the past few years, then conducting a linear analysis on the sales data, with monthly sales on the y-axis and time on the x-axis, will give you a line that depicts the upward trend in sales. After creating the trend line, the company could use the slope of the line to forecast sales in future months. Next is analyzing the impact of price changes. Linear regression can be used to analyze the effect of pricing on consumer behavior. For instance, if a company changes the price of a certain product several times, it can record the quantity it sells at each price level and then perform a linear regression with sold quantity as the dependent variable and price as the independent variable. This would result in a line that depicts the extent to which customers reduce their consumption of the product as the price increases. This result would help in future pricing decisions. Next is assessment of risk in the financial services and insurance domain. Linear regression can be used to analyze risk. For example, a health insurance company might conduct a linear regression analysis by plotting the number of claims per customer against the customer’s age, and it might discover that older customers tend to make more health insurance claims. The results of such an analysis might guide important business decisions.

All right, so by now you have a rough idea of what the linear regression algorithm is: what it does, where it is used, and when you should use it. Now let’s move on and understand
the algorithm in depth. Suppose you have the independent variable on the x-axis and the dependent variable on the y-axis, and suppose these are the data points. As the independent variable increases on the x-axis, so does the dependent variable on the y-axis. What kind of linear regression line would you get? You would get a positive linear regression line, as the slope would be positive. Next, suppose the independent variable on the x-axis is increasing while the dependent variable on the y-axis is decreasing. What kind of line will you get in that case? You will get a negative regression line, as the slope of the line is negative. This line, y = mx + c, which shows the relationship between the independent variable and the dependent variable, is known as the line of linear regression.

Okay, so let’s add some data points to our graph. These are some observations, or data points, on our graph. Let’s plot some more. Now that all our data points are plotted, our task is to create a regression line, or the best-fit line. Once our regression line is drawn, the next task is prediction. Suppose this is our estimated, or predicted, value, and this is our actual value. Our main goal is to reduce this error, that is, to reduce the distance between the estimated or predicted value and the actual value. The best-fit line is the one that has the least error, or the least difference between the estimated value and the actual value. In other words, we have to minimize the error. This was a brief overview of the linear regression algorithm; soon we’ll jump to the mathematical implementation. All right, but before that,
let me tell you this. Suppose you draw a graph with speed on the x-axis and distance covered on the y-axis, with time remaining constant. If you plot a graph of the speed of the vehicle against the distance traveled in a fixed unit of time, you will get a positive relationship. Suppose the equation of the line is y = mx + c. In this case y is the distance traveled in a fixed duration of time, x is the speed of the vehicle, m is the positive slope of the line, and c is the y-intercept of the line. Now suppose the distance remains constant, and you plot a graph of the speed of the vehicle against the time taken to travel a fixed distance. In that case you will get a line with a negative relationship: the slope of the line is negative, and the equation of the line changes to y = -mx + c, where y is the time taken to travel a fixed distance, x is the speed of the vehicle, -m is the negative slope of the line, and c is the y-intercept of the line. All right, now let’s get back to our independent and dependent variables. In those terms, y is our dependent variable and x is our independent variable. Now let’s move on and see the mathematical implementation
of this. We have x = 1, 2, 3, 4, 5; let’s plot them on the x-axis, marked 0, 1, 2, 3, 4, 5, 6. And we have y = 3, 4, 2, 4, 5, so let’s mark 1, 2, 3, 4, 5 on the y-axis. Now let’s plot our coordinates one by one. For x = 1 we have y = 3, so there’s the point (1, 3). Similarly we have (2, 4), (3, 2), (4, 4), and (5, 5). Moving on, let’s calculate the means of x and y and plot them on the graph. The mean of x is (1 + 2 + 3 + 4 + 5) / 5, that is 3. Similarly, the mean of y is (3 + 4 + 2 + 4 + 5) / 5, that is 18 / 5, which is 3.6. Next we’ll plot our mean point, (3, 3.6), on the graph. Our goal is to find the best-fit line using the least squares method. To do that, we first need the equation of our regression line. Let’s suppose our regression line is y = mx + c. Now we have an equation of the line, so all we need to do is
find the values of m and c, where m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)². Don’t get confused; let me break it down for you. All right, moving on with the formula.
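
The slope formula just stated can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the video: the function name `fit_line` is hypothetical, the data are the five points from this example, and the intercept relies on the assumption that the fitted line passes through the mean point (3, 3.6).

```python
def fit_line(xs, ys):
    """Return (m, c) for the least squares line y = m*x + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of (x - x_bar) * (y - y_bar)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of (x - x_bar) squared
    den = sum((x - x_bar) ** 2 for x in xs)
    m = num / den
    # Assumption: the line passes through the mean point (x_bar, y_bar)
    c = y_bar - m * x_bar
    return m, c


xs = [1, 2, 3, 4, 5]
ys = [3, 4, 2, 4, 5]
m, c = fit_line(xs, ys)
print(round(m, 2), round(c, 2))
```

Keeping the numerator and denominator as separate sums mirrors the columns of the hand calculation, which makes it easy to compare each intermediate value with the table.
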
will calculate x minus X bar. So we have X as 1 minus X bar
as 3 so 1 minus 3 It is minus 2 next we have x
equal to minus its mean 3 that is minus 1 similarly. We have 3 minus 3 is 0 4 –
3 1 5 – 3 2. All right, so x minus X bar. It’s nothing but the distance
of all the point through the line y equal 3 and what does this y minus y bar implies
it implies the distance of all the point from the line x equal 3 .6 fine. So let’s calculate the value
of y minus y bar. So starting with y equal 3 – value of y bar
that is 3.6. So it is three minus 3.6
how much – of 0.6 next is 4 minus 3.6
that is 0.4 next to minus 3.6 that is – of 1.6. Next is 4 minus 3.6
that is 0.4 again, 5 minus 3.6 that is
one point four. Alright, so now we are done
with Y minus y bar. Fine now next we will calculate
x minus X bar whole Square. So let’s calculate x
minus X bar whole Square so it is minus 2 whole square. That is 4 minus 1 whole square. That is 1 0 squared is
0 1 Square 1 2 square for fine. So now in our table we have x
minus X bar y minus y bar and x minus X bar whole Square. Now what we need. We need the product of x
minus X bar X Y minus y bar. Alright, so let’s see
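
The table being built here can be reproduced with a short Python sketch. It is only an illustration, assuming the columns are held as plain lists; the helper name `deviation_table` is hypothetical. It also fills in the product column that the formula calls for.

```python
def deviation_table(xs, ys):
    """Return rows of (x, y, dx, dy, dx squared, dx * dy),
    where dx = x - mean(x) and dy = y - mean(y)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    rows = []
    for x, y in zip(xs, ys):
        dx = x - x_bar
        dy = y - y_bar
        rows.append((x, y, dx, dy, dx ** 2, dx * dy))
    return rows


rows = deviation_table([1, 2, 3, 4, 5], [3, 4, 2, 4, 5])
print("x  y  dx    dy    dx^2  dx*dy")
for x, y, dx, dy, dx2, prod in rows:
    # Values are rounded for display only.
    print(f"{x}  {y}  {dx:+.1f}  {dy:+.1f}  {dx2:.1f}   {prod:+.1f}")
```

Summing the last two columns of `rows` gives the denominator and numerator of the slope formula.
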
the product of x minus X bar X Y minus y bar that is minus
of 2 x minus of 0.6 that is 1.2 minus
of 1 multiplied by zero point 4 that is
minus of 0 point 4 0 x minus of 1.6. That is 0 1 multiplied
by zero point four that is 0.4. And next 2 multiplied
by 1 point for that is 2.8. All right. Now almost all the parts
of our formula is done. So now what we need
to do is get the summation of last two columns. All right, so the summation of X minus X bar
whole square is 10 and the summation of x minus X bar X Y minus y bar is 4
so the value of M will be equal to 4 by 10 fine. So let’s put this value
of m equals zero point four and our line y equal MX plus C. So let’s fill all the points
into the equation and find the value of C. So we have y as 3.6 remember
the mean by Ms. 0.4 which we calculated just now X
as the mean value of x that is 3 and we have the equation as 3 point
6 equals 0 point 4 x 3 plus C. Alright that is 3.6 equal
1 Point 2 plus C. So what is the value of C
that is 3.6 minus 1 Point 2. That is 2 point 4. All right. So what we had we had m
equals zero point four see as 2.4 and then finally when we calculate the equation
of the regression line what we get is y equal
zero point four times of X plus two point four. So this is the regression line. All right, so there is how you
are plotting your points. This is your actual point. All right. Now for given m equals
zero point four and SQL 2.4. Let’s predict the value of y
for x equal 1 2 3 4 & 5. So when x equal
1 the predicted value of y will be zero point 4 x one
plus two point four that is 2.8. Similarly when x equal
to predicted value of y will be zero point 4 x 2 + 2 point 4 that equals
to 3 point 2 similarly x equal 3 y will be 3 point 6 x equal 4 y will be 4 point 0 x equal 5 y will be
four point four. So let’s plot them on the graph and the line passing through
all these predicting point and cutting y-axis
at 2.4 as the line of regression now your task
is to calculate the distance between the actual and the predicted value and your job is
to reduce the distance. Like or in other words, you have to reduce the error
between the actual and the predicted value the line with the least error will be
the line of linear regression or regression line and it will also be
the best fit line. Alright, so this is
how things work in the computer. It performs a number of iterations for different values of m. For each value of m it calculates the equation of the line, y = mx + c, so as the value of m changes, the line changes. The iterations start from one, and after every iteration it calculates the predicted values according to the line and compares the distance between the actual values and the predicted values. The value of m for which the distance between the actual and the predicted values is minimum is selected for the best-fit line.

Now that we have calculated the best-fit line, it’s time to check the goodness of fit, that is, how well our model is performing. To do that, we have a method called the R-squared method. So what is R-squared? The R-squared value is a statistical measure of how close the data are to the fitted regression line. In general, a model with a high R-squared value is considered a good model, but you can also have a low R-squared value for a good model, or a high R-squared value for a model that does not fit well at all. It is also known as the coefficient of determination, or the coefficient of multiple determination. Let’s move on and see
how R-squared is calculated. These are our actual values plotted on the graph. We had calculated the predicted values of y as 2.8, 3.2, 3.6, 4.0, and 4.4. Remember, we calculated them from the equation y_p = 0.4x + 2.4 for x = 1, 2, 3, 4, and 5. Let’s plot them on the graph. These are the points, and the line passing through them is nothing but the regression line. Now what you need to do is compare the distance of the actual values from the mean with the distance of the predicted values from the mean. That comparison is nothing but R-squared. Mathematically, you can represent it as R² = Σ(y_p − ȳ)² / Σ(y − ȳ)², where y is the actual value, y_p is the predicted value, and ȳ is the mean value of y, which is 3.6. So remember, this is our formula. Next,
we’ll calculate y − ȳ. We have y = 3 and ȳ = 3.6, so we calculate 3 − 3.6, which is −0.6. Similarly, for y = 4 we have y − ȳ = 0.4; then 2 − 3.6 is −1.6; 4 − 3.6 is again 0.4; and 5 − 3.6 is 1.4. So we have the values of y − ȳ. Now we have to square them: (−0.6)² is 0.36, 0.4² is 0.16, (−1.6)² is 2.56, 0.4² is 0.16, and 1.4² is 1.96. As part of the formula we also need the y_p − ȳ values. These are the y_p values, and we have to subtract the mean from them: 2.8 − 3.6 is −0.8; 3.2 − 3.6 is −0.4; 3.6 − 3.6 is 0; 4.0 − 3.6 is 0.4; and 4.4 − 3.6 is 0.8. So we have calculated y_p − ȳ; now it’s time to calculate (y_p − ȳ)². We have (−0.8)² = 0.64, (−0.4)² = 0.16, 0² = 0, 0.4² = 0.16 again, and 0.8² = 0.64. Now the formula tells us to take the sum of (y_p − ȳ)² and the sum of (y − ȳ)². Summing (y − ȳ)² gives 5.2, and summing (y_p − ȳ)² gives 1.6. So R² is 1.6 / 5.2, which is approximately 0.3. Well, this is not a good fit; it suggests that the data points are far away from the regression line. This is how your graph will look when R² is 0.3. When you increase the value of R² to 0.7, you’ll see that the actual values lie closer to the regression line; at 0.9 they get even closer; and when the value is approximately 1, the actual values lie on the regression line itself. On the other hand, if you get a very low value of R², say 0.02, you will see that the actual values are very far away from the regression line, or you can say that there are too many outliers in your data. You cannot forecast anything from such data. All right, so this was all about
the calculation of R-squared. Now you might ask: are low values of R-squared always bad? Well, in some fields it is entirely expected that R-squared values will be low. For example, any field that attempts to predict human behavior, such as psychology, typically has R-squared values lower than around 50%, from which you can conclude that humans are simply harder to predict than physical processes. Furthermore, if your R-squared value is low but you have statistically significant predictors, you can still draw important conclusions about how changes in the predictor values are associated with changes in the response value. Regardless of the R-squared, the significant coefficients still represent the mean change in the response for one unit of change in the predictor while holding the other predictors in the model constant. Obviously, this type of information can be extremely valuable.

All right, so this was all about the theoretical concepts. Now let’s move on to the coding part and understand the code in depth. For implementing
linear regression using Python, I’ll be using Anaconda with Jupyter Notebook installed on it. So here is a Jupyter notebook, and we are using Python 3. We are going to use a data set consisting of the head sizes and brain weights of different people. Let’s import our libraries: %matplotlib inline, then we import numpy as np, pandas as pd, and from matplotlib we import pyplot as plt. Next we import our data, headbrain.csv, and store it in the data variable. Let’s hit the Run button and see the output. This asterisk symbol indicates that the cell is still executing. Here is the output: our data set consists of 237 rows and 4 columns. The columns are gender, age range, head size in cubic centimeters, and brain weight in grams. So that’s our sample data set; this is how it looks. Now that we have imported our data, you can see there are 237 values in the training set, so we can find a linear relationship between the head size and the brain weight. What we’ll do now is collect X and Y: X consists of the head size values and Y consists of the brain weight values. So, collecting X and Y.
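
The collecting step can be sketched as below. The video uses pandas for this; to keep the sketch free of third-party dependencies it uses Python’s standard csv module instead, so it is a stand-in rather than the code shown on screen. The helper name `collect_xy` is hypothetical.

```python
import csv


def collect_xy(lines, x_col, y_col):
    """Collect two numeric columns from CSV text into lists X and Y.

    `lines` is any iterable of CSV lines, for example an open file.
    The first line must be the header row.
    """
    X, Y = [], []
    for row in csv.DictReader(lines):
        X.append(float(row[x_col]))
        Y.append(float(row[y_col]))
    return X, Y
```

With the real file this might be called as `with open("headbrain.csv", newline="") as f: X, Y = collect_xy(f, "Head Size(cm^3)", "Brain Weight(grams)")`. Both column headers are assumptions and should be checked against the actual file.
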
Let’s hit Run. Done. Next we need to find the values of b1 and b0, or you can say m and c. For that we need the means of X and Y, so first we calculate mean_x = np.mean(X); mean is a predefined function of NumPy. Similarly, mean_y = np.mean(Y) returns the mean of the Y values. Next we get the total number of values, m = len(X). Then we use the formula to calculate the values of b1 and b0, or m and c. Let’s hit the Run button and see the result. As you can see on the screen, we get b1 as 0.263 and b0 as 325.57. Now that we have the coefficients, comparing with the equation y = mx + c, you can say that brain weight = 0.263 × head size + 325.57. So the value of m here is 0.263 and the value of c is 325.57. All right, so that’s our linear model. Now let’s plot it and see it graphically. Let’s execute it. This is how our plot looks. This model is not so bad, but we need to find out how good it is. There are many methods for that, like the root mean square error or the coefficient of determination, that is, the R-squared method. In this tutorial I have told you about the R-squared method, so let’s focus on that and see how good our model is. So let’s calculate
the R-squared value. Here ss_t is the total sum of squares, ss_r is the sum of squares of the residuals, and the formula for R-squared is 1 minus the sum of squares of the residuals divided by the total sum of squares, that is, R² = 1 − ss_r / ss_t. For a least squares line this gives the same value as the ratio we used earlier. When you execute it, you get an R-squared value of 0.63, which is pretty good. Now that you have implemented a simple linear regression model using the least squares method, let’s move on and see how you would implement the model using the machine learning library called scikit-learn. scikit-learn is a simple machine learning library in Python, and building machine learning models is very easy with it. So suppose this is the Python code from before; using the scikit-learn library, your code shortens to this length. Let’s hit the Run button, and you will see that you get the same R² score as well.

Well, this was all for today’s discussion. In case you have any doubts, feel free to add your query to the comment section. Thank you. I hope you have enjoyed listening to this video. Please be kind enough to like it, and you can comment any of your doubts and queries and we will reply to them at the earliest. Do look out for more videos in our playlist and subscribe to the Edureka channel to learn more. Happy learning.
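
As a recap of the goodness-of-fit step from the session, the sketch below computes R-squared for the five-point worked example in two ways: as the ratio of the explained sum of squares to the total sum of squares, and as 1 minus the residual sum of squares over the total. It is a minimal illustration, not the notebook code; the function name `r_squared` is hypothetical, and the two values agree here only because the line is a least squares fit with an intercept.

```python
def r_squared(ys, preds):
    """Return R-squared computed two ways for actual values and predictions."""
    y_bar = sum(ys) / len(ys)
    # Total sum of squares: spread of the actual values around their mean
    ss_t = sum((y - y_bar) ** 2 for y in ys)
    # Explained sum of squares: spread of the predictions around the mean
    ss_reg = sum((p - y_bar) ** 2 for p in preds)
    # Residual sum of squares: squared gaps between actual and predicted
    ss_r = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return ss_reg / ss_t, 1 - ss_r / ss_t


xs = [1, 2, 3, 4, 5]
ys = [3, 4, 2, 4, 5]
# Regression line from the worked example: y = 0.4x + 2.4
preds = [0.4 * x + 2.4 for x in xs]
ratio, one_minus = r_squared(ys, preds)
print(round(ratio, 3), round(one_minus, 3))
```

For a line that is not the least squares fit, the two expressions generally differ, and the second form (1 minus residual over total) is the one described for the head size data.
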
