Linear regression is one of the most familiar statistical models used for machine learning in Python. A crucial part of data science is understanding the algorithm and how it works. Linear regression lets us model the relationship between one dependent variable and one or more independent variables.
Understanding regression in simple steps.
Regression analysis is a form of predictive modeling that helps us understand the relationship between a dependent variable and one or more independent variables. There are several types to study:
- Linear Regression
- Logistic Regression
- Polynomial Regression
- Stepwise Regression
Linear regression is used in many areas of business and helps in understanding the market. A few examples show where it is useful.
- It helps to evaluate sales and estimate progress
Linear regression forecasts growth and estimates the direction of business trends. It gives a picture of what next season's sales may look like based on previous sales records.
- Understanding how a price change impacts your business
When a change in product price is under consideration, linear regression estimates how the change affects consumers and their behavior. This helps the business make difficult decisions.
- Analysing risk factors
Business always involves risk. When a plan is put into action, risk can be reduced by basing decisions on previous records and observations.
How to use Linear Regression through different techniques in Python
Least Square Method
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns (Wikipedia).
The regression line is y = mx + c, where y is the dependent variable, x is the independent variable, m is the slope, and c is the y-intercept.
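For reference, the least squares estimates of the slope m and the intercept c can be written as follows. The notation is introduced here for illustration: x̄ and ȳ denote the means of x and y, and n is the number of observations.

```latex
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
```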
Implementation using Python
Let’s understand this using a dataset of head sizes and brain weights of various people.
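# Importing Necessary Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)
# Reading Data
# NOTE: the file name below is an assumption; point read_csv at your copy of the dataset
data = pd.read_csv('headbrain.csv')
# Collecting X and Y
X = data['Head Size(cm^3)'].values
Y = data['Brain Weight(grams)'].values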
To get the values of m and c, the means of X and Y must be calculated first.
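# Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)
# Total number of values
n = len(X)
# Using the formula to calculate m and c
numer = 0
denom = 0
for i in range(n):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
m = numer / denom
c = mean_y - (m * mean_x)
# Print coefficients
print(m, c)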
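The same calculation can also be expressed with NumPy array operations instead of accumulating term by term. The sketch below is one way to do that, not the article's own code: the function name `fit_line` and the small synthetic arrays are made up for illustration, and `np.polyfit` is used only as a cross-check.

```python
import numpy as np

def fit_line(x, y):
    """Return slope m and intercept c of the least squares line y = m*x + c."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    # Same formula as the element-by-element version, applied to whole arrays
    m = np.sum((x - mean_x) * (y - mean_y)) / np.sum((x - mean_x) ** 2)
    c = mean_y - m * mean_x
    return m, c

# Synthetic data (hypothetical): points that lie exactly on y = 2x + 1
x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = 2.0 * x_demo + 1.0

m_demo, c_demo = fit_line(x_demo, y_demo)

# np.polyfit with degree 1 returns [slope, intercept] and should agree
slope_ref, intercept_ref = np.polyfit(x_demo, y_demo, 1)
print(m_demo, c_demo)
```

With the head size and brain weight arrays, the call would be `fit_line(X, Y)`.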
The calculated values are now substituted into the following equation: brainWeight = c + m * headSize
This gives a value of y for each value of x, so a graph can be plotted from these values.
# Plotting Values and Regression Line
max_x = np.max(X) + 100
min_x = np.min(X) - 100
# Calculating line values x and y
x = np.linspace(min_x, max_x, 1000)
y = c + m * x
# Plotting Line
plt.plot(x, y, color='#52b920', label='Regression Line')
# Plotting Scatter Points
plt.scatter(X, Y, c='#ef4423', label='Scatter Plot')
plt.xlabel('Head Size in cm3')
plt.ylabel('Brain Weight in grams')
# Show the labels defined above, then display the figure
plt.legend()
plt.show()
The R-squared value measures how close the data are to the fitted regression line.
y = actual value
ȳ = mean value of y
yp = predicted value of y
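With these symbols, the R-squared value is commonly written in the following standard form, where the sums run over all observations:

```latex
R^2 = 1 - \frac{\sum (y - y_p)^2}{\sum (y - \bar{y})^2}
```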
R-squared does not tell you whether the regression model is correct. A good model can have a very low R-squared value, and a model that does not fit the given data can have a high one.
Implementation using Python
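A minimal sketch of this calculation is shown below. It assumes the standard definition R^2 = 1 - SS_res / SS_tot; the helper name `r_squared` and the small example arrays are hypothetical and stand in for the head size and brain weight data.

```python
import numpy as np

def r_squared(y_actual, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Sum of squared residuals between actual and predicted values
    ss_res = np.sum((y_actual - y_pred) ** 2)
    # Total sum of squares around the mean of the actual values
    ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical values used only to exercise the function
y_true = np.array([3.0, 5.0, 7.0, 9.0])

# Predictions identical to the data
perfect = r_squared(y_true, y_true)
# Always predicting the mean of the data
baseline = r_squared(y_true, np.full(4, y_true.mean()))
print(perfect, baseline)
```

With the variables from the least squares section, the call would be `r_squared(Y, c + m * X)`.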
Scikit Learn Method
scikit-learn is a machine learning library that provides a ready-made linear regression model.
Using it reduces the effort, because the library handles the fitting and scoring calculations for us.
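A minimal sketch of the scikit-learn approach follows. It uses `LinearRegression` from `sklearn.linear_model` with its `fit`, `predict`, and `score` methods; the synthetic arrays are hypothetical placeholders for the head size and brain weight columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (hypothetical): points that lie exactly on y = 0.25x + 300
X_demo = np.array([3000.0, 3500.0, 4000.0, 4500.0])
Y_demo = 0.25 * X_demo + 300.0

# scikit-learn expects a 2-D feature array of shape (n_samples, n_features)
X_2d = X_demo.reshape(-1, 1)

reg = LinearRegression()
reg.fit(X_2d, Y_demo)

m_sk = reg.coef_[0]               # slope
c_sk = reg.intercept_             # y-intercept
r2_sk = reg.score(X_2d, Y_demo)   # R-squared on the data used for fitting
Y_pred = reg.predict(X_2d)        # predicted values for each input
print(m_sk, c_sk, r2_sk)
```

For the article's data, the same calls would take `X.reshape(-1, 1)` and `Y` in place of the synthetic arrays.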
These are the techniques covered here for performing linear regression in Python.