Earlier in this machine learning course, we covered what the dependent variable and the independent variable are in a machine learning model.
Now it is time to put these variables to use and perform an operation like simple linear regression. If you have no idea what regression is, read the earlier post on regression in machine learning first.
In this post you will learn about the history of regression, the Least Squares Method, and more.
What is Simple Linear Regression?
Simple linear regression models the relationship between a dependent variable and an independent variable. The word “simple” means that we study only one predictor, or independent variable.
In an upcoming blog post, we will study multiple linear regression, in which the word “multiple” means that we study two or more predictors, or independent variables.
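For reference, the one-predictor model described above is commonly written as a straight line, and the least squares method picks the line that minimizes the sum of squared errors. The symbols below (b_0, b_1, x_i, y_i) are notation chosen for this sketch, not taken from the post:

```latex
\hat{y} = b_0 + b_1 x, \qquad
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
b_0 = \bar{y} - b_1 \bar{x}
```

Here x is the single independent variable (experience in our case), y is the dependent variable (salary), b_0 is the intercept, and b_1 is the slope, or coefficient.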
Let’s start coding.
Simple Linear Regression – Step 1
We have already imported the necessary modules:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split
For linear regression we need one more class from scikit-learn, so here is the import.
from sklearn.linear_model import LinearRegression
Simple Linear Regression – Step 2
Import the dataset.
datasetSalSimpleLinear = pd.read_csv('SalaryAndExp.csv')
datasetSalSimpleLinear.head()  # display the first rows and check the column names
datasetSalSimpleLinear.info()  # check the data types and the number of records
datasetSalSimpleLinear.describe()  # count, mean, std, min, 25%, 50%, 75%, max for each numeric column
datasetSalSimpleLinear.columns  # all the column names of the dataset
xSalarySimpleLinear = datasetSalSimpleLinear.iloc[:, :-1].values  # every column except the last (experience)
ySalarySimpleLinear = datasetSalSimpleLinear.iloc[:, -1].values  # the last column (salary)
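To see what the two iloc slices do without needing SalaryAndExp.csv, here is a small sketch on a made-up DataFrame. The column names YearsExperience and Salary and all the values are assumptions for illustration; the real file may look different.

```python
import pandas as pd

# Made-up stand-in for SalaryAndExp.csv (column names and values are assumptions)
demoSalary = pd.DataFrame({
    "YearsExperience": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Salary": [40000, 45000, 52000, 58000, 65000],
})

# Every column except the last -> 2-D feature array with one column
xDemo = demoSalary.iloc[:, :-1].values
# Only the last column -> 1-D target array
yDemo = demoSalary.iloc[:, -1].values

print(xDemo.shape)  # (5, 1)
print(yDemo.shape)  # (5,)
```

scikit-learn expects the features as a 2-D array, which is why the features are taken with the slice :-1 instead of a single column index.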
Simple Linear Regression – Step 3
Splitting the dataset is one of the important steps in creating a machine learning model. We have already covered it in the post Splitting of our Dataset, so here we only apply it.
xTrainSimple, xTestSimple, yTrainSimple, yTestSimple = train_test_split(xSalarySimpleLinear, ySalarySimpleLinear, test_size=0.2, random_state=0)
Simple Linear Regression – Step 4
Create an object of the imported class and fit it to the training data.
simpleRegres = LinearRegression()
simpleRegres.fit(xTrainSimple, yTrainSimple)
The fit method is part of the LinearRegression class. We call it to fit the model to the training data.
We want to fit only the training data, which is why we pass xTrainSimple and yTrainSimple to the fit method above.
Notice that we don’t assign the result to another variable.
The call takes effect on the simpleRegres object itself, so there is nothing more to assign.
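The point about not assigning the result can be checked with a tiny sketch. The data and the names xDemo, yDemo and demoRegres are made up for illustration and are not part of the salary dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: salary = 30000 + 5000 * years of experience (an assumption)
xDemo = np.array([[1.0], [2.0], [3.0], [4.0]])
yDemo = 30000 + 5000 * xDemo.ravel()

demoRegres = LinearRegression()
returned = demoRegres.fit(xDemo, yDemo)

# fit returns the same object it was called on, so no assignment is needed
print(returned is demoRegres)

# After fit, the learned parameters are stored on the object itself
print(hasattr(demoRegres, "coef_"), hasattr(demoRegres, "intercept_"))
```

Because fit returns the estimator itself, writing LinearRegression().fit(x, y) on one line is also a common pattern.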
Simple Linear Regression – Step 5
Now let’s predict the salary. To do this, we need to pass the employee experience values, which we already have in “xTestSimple”, to the predict method.
The predict method returns the predicted salaries, which we store in the variable on the left-hand side. Later we will compare the real salaries with the predicted salaries using this variable.
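A minimal sketch of this prediction step could look like the following. It runs on made-up data so that it works without SalaryAndExp.csv, and the variable name yPredDemo is an assumption; with the real data you would call predict on simpleRegres with xTestSimple.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data standing in for the salary dataset (all values are assumptions)
rng = np.random.default_rng(0)
xDemo = np.linspace(1, 10, 30).reshape(-1, 1)                    # years of experience
yDemo = 30000 + 5000 * xDemo.ravel() + rng.normal(0, 2000, 30)   # salary with noise

xTrainDemo, xTestDemo, yTrainDemo, yTestDemo = train_test_split(
    xDemo, yDemo, test_size=0.2, random_state=0)

demoRegres = LinearRegression()
demoRegres.fit(xTrainDemo, yTrainDemo)

# Predict the salary for the held-out experience values
yPredDemo = demoRegres.predict(xTestDemo)

# Show the real and the predicted salary side by side
for real, predicted in zip(yTestDemo, yPredDemo):
    print(f"real: {real:10.2f}   predicted: {predicted:10.2f}")
```

The predictions come back as an array with one value per test row, so they line up with yTestDemo element by element.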
Simple Linear Regression – Step 6
Plot and compare the real salary and the predicted salary for both the training and the test set.
plt.scatter(xTrainSimple, yTrainSimple, color='blue')
xTrainSimple: years of experience.
yTrainSimple: salary.
color: the color you want to use for the points.
We have already covered regression in our previous blog post, Regression in machine learning.
So, let’s draw the regression line.
In the second argument, we pass “xTrainSimple” to the predict function, because predict returns the predicted salary for the training set.
Image1 displays the plot of the predicted salary for the training set, while Image2 displays the predicted salary for the test set.
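The two plots can be sketched roughly as follows. This is an illustration on made-up data, not the exact code behind the images; the colors, titles, and axis labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data standing in for the salary dataset (all values are assumptions)
rng = np.random.default_rng(0)
xDemo = np.linspace(1, 10, 30).reshape(-1, 1)
yDemo = 30000 + 5000 * xDemo.ravel() + rng.normal(0, 2000, 30)

xTrainDemo, xTestDemo, yTrainDemo, yTestDemo = train_test_split(
    xDemo, yDemo, test_size=0.2, random_state=0)
demoRegres = LinearRegression().fit(xTrainDemo, yTrainDemo)

fig, (axTrain, axTest) = plt.subplots(1, 2, figsize=(10, 4))

# Training set: real salaries as points, fitted regression line on top
axTrain.scatter(xTrainDemo, yTrainDemo, color='blue')
axTrain.plot(xTrainDemo, demoRegres.predict(xTrainDemo), color='red')
axTrain.set_title('Training set')
axTrain.set_xlabel('Years of experience')
axTrain.set_ylabel('Salary')

# Test set: real test salaries as points, the same fitted line for comparison
axTest.scatter(xTestDemo, yTestDemo, color='blue')
axTest.plot(xTrainDemo, demoRegres.predict(xTrainDemo), color='red')
axTest.set_title('Test set')
axTest.set_xlabel('Years of experience')
axTest.set_ylabel('Salary')

fig.tight_layout()
fig.canvas.draw()  # with a display, remove the Agg line and call plt.show() instead
```

The line is drawn from the training predictions in both panels, since the model is fitted once and the same line is then compared against the unseen test points.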
Other questions you may run into during your machine learning coding.
We have learned simple linear regression, but there is another point that needs discussion: how to interpret “simpleRegres”.
To evaluate the model, you can start by printing the intercept, which the fitted model stores in its intercept_ attribute.
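As a hedged sketch, the intercept can be read from the fitted object like this. The data is made up and noise-free so the result is easy to check; with the real model you would print simpleRegres.intercept_.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up, noise-free data: salary = 30000 + 5000 * experience (an assumption)
xDemo = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
yDemo = 30000 + 5000 * xDemo.ravel()

demoRegres = LinearRegression().fit(xDemo, yDemo)

# intercept_ is the predicted salary at zero years of experience
print(demoRegres.intercept_)
```

Since the made-up salaries follow the line exactly, the printed intercept should come out at approximately 30000.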
How to display the coefficient of a feature in Simple Linear Regression?
The coefficients relate to each feature in our dataset, and the fitted model stores them in its coef_ attribute.
This returns one coefficient per feature.
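A small sketch of reading the coefficient, again on made-up data. The feature name YearsExperience is hypothetical; with the real model you would read simpleRegres.coef_.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up, noise-free data: salary = 30000 + 5000 * experience (an assumption)
xDemo = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
yDemo = 30000 + 5000 * xDemo.ravel()

demoRegres = LinearRegression().fit(xDemo, yDemo)

# coef_ holds one value per feature; with a single predictor it has length 1
print(demoRegres.coef_)

# Pair each coefficient with its feature name (the name is hypothetical)
coefTable = pd.DataFrame(demoRegres.coef_,
                         index=["YearsExperience"],
                         columns=["Coefficient"])
print(coefTable)
```

In this made-up example the coefficient is approximately 5000, meaning each additional year of experience adds about 5000 to the predicted salary.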
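How to use the Boston Dataset in Simple Linear Regression?
It is housing data, real data from the 1970s. It is very old, but you can use it for coding practice.
Now, if you want to practice on real data, the Boston dataset is available in older versions of sklearn. Note that the load_boston function was deprecated in scikit-learn 1.0 and removed in version 1.2, so the import below fails on current releases.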
from sklearn.datasets import load_boston
The loaded Boston object is dictionary-like and holds a lot of information.
To get the description of the Boston dataset, read its DESCR entry.
# To get the data.
# To grab the target: the house price in thousands of dollars.
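Because load_boston is no longer available in current scikit-learn releases, the sketch below uses load_diabetes purely as a stand-in. It is a different dataset (a disease progression measure, not house prices), but it is bundled with scikit-learn and returns the same kind of dictionary-like object, so the description, the data, and the target are accessed the same way. The choice of the bmi column as the single predictor is an assumption for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# load_diabetes is only a stand-in here; it is not the Boston data
diabetes = load_diabetes()

print(diabetes.keys())           # the entries available in the dictionary-like object
print(diabetes['DESCR'][:300])   # the beginning of the dataset description

xDiabetes = diabetes['data']     # to get the data (feature matrix)
yDiabetes = diabetes['target']   # to grab the target values
print(xDiabetes.shape, yDiabetes.shape)
print(diabetes['feature_names'])

# Simple linear regression needs one predictor: keep only the 'bmi' column (index 2)
xBmi = xDiabetes[:, [2]]
bmiRegres = LinearRegression().fit(xBmi, yDiabetes)
print(bmiRegres.intercept_, bmiRegres.coef_)
```

The double brackets in xDiabetes[:, [2]] keep the selected column as a 2-D array, which is the shape the fit method expects for the features.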