Matrix of Features in Machine Learning

The previous two blog posts cover the general information you need before you start studying data science. If you want to pick up that basic knowledge first, here are the links:

Machine learning and the fortune of the earth!!!
Getting started as Data Scientist

In today’s post, we will learn:
1) How to create a new notebook using Google Colab.
2) How to import the important libraries, because this is the first step in learning machine learning.
3) How to import a CSV file.
4) How to print the values of the CSV file.

Let’s Create a NEW NOTEBOOK Using Google Colab

Step 1) Paste this URL into your browser: “”.
Step 2) Click on the File menu and select CREATE NEW NOTEBOOK.
Step 3) Rename your file as shown in the image below.
Step 4) To add text, click on “+Text”. To add code, click on “+Code”.
Step 5) You are now ready to work with machine learning. This is the advantage of Google Colab.

You don’t need to download any files. Just create a new notebook and you are good to go.

Let’s import the necessary libraries

import numpy as np
import matplotlib.pyplot as plt  # lets us plot charts and graphs (data visualization)
import pandas as pd

As the next step, we will bring in the dataset. For this post, I assume you know how to create a CSV file. If not, write in the comment section below and I will give you the information.

Let us now import the dataset. We have already imported pandas for this purpose. Here I create a variable that will let me work with the dataset.

dataset = pd.read_csv('YOUR_CSV_FILE_NAME.csv')
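To see what reading and printing a CSV file looks like without needing a file on disk, here is a minimal sketch. The column names (State, Age, Salary, Purchased) and all the values are made up for illustration; they are an assumption, not the actual dataset used in this post.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for YOUR_CSV_FILE_NAME.csv.
# The column names and values are invented for this example only.
csv_text = """State,Age,Salary,Purchased
MP,20,5000,Yes
UP,25,7000,No
MP,30,9000,Yes
"""

# pd.read_csv accepts a file path or a file-like object,
# so a StringIO buffer behaves like a small file.
dataset = pd.read_csv(io.StringIO(csv_text))

print(dataset.head())  # first rows of the data frame
print(dataset.shape)   # (number of rows, number of columns)
```

With a real file, you would pass the file name to pd.read_csv instead of the StringIO buffer.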

The line above creates a data frame that holds all the values from the CSV file, and the dataset variable refers to exactly this data frame. The next step is to create two entities.

1) The matrix of features.

2) The dependent variable vector.

Why do we need two entities?

Because the machine learning models we are going to build expect exactly these two entities as their input.

MP 20 Yes
UP 5000 Yes

Now that you have imported the CSV file, we can use it to predict whether the student will buy or not. When you work with or learn ML or DS, datasets are usually arranged so that the first columns are the features and the last column is the dependent variable. Here we have 4 columns: the first three form the matrix of features, while the last one is the dependent variable.

X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values 

Let’s understand this line of code: “X = dataset.iloc[:, :-1].values”

“iloc” lets us locate data by index. It takes the indexes of the rows and the indexes of the columns we want to extract from the dataset.

In Python, “:” means a range. To use all the rows, we specify “:” on its own.

“:-1” in Python means everything up to, but not including, the last item, so here we take all the columns except the last one.
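The same slice notation works on an ordinary Python list, which makes it easy to try out. A small sketch, using hypothetical column names that are not taken from the real dataset:

```python
# Hypothetical column names, used only to illustrate slicing.
columns = ["State", "Age", "Salary", "Purchased"]

all_items = columns[:]         # ":" alone selects everything
all_but_last = columns[:-1]    # ":-1" stops just before the last item
last_item = columns[-1]        # "-1" alone picks the last item

print(all_items)     # ['State', 'Age', 'Salary', 'Purchased']
print(all_but_last)  # ['State', 'Age', 'Salary']
print(last_item)     # Purchased
```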

“.values”: you can easily convert a pandas object to a NumPy array by invoking “.values”. Many machine learning libraries are designed to work on NumPy arrays.
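Here is a short sketch of that conversion on a small numeric data frame. The data is made up for illustration. It also shows to_numpy(), which the pandas documentation recommends as the newer way to do the same thing.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data, invented for this example.
df = pd.DataFrame({"Age": [20, 25, 30], "Salary": [5000, 7000, 9000]})

as_values = df.values     # the attribute used in this post
as_numpy = df.to_numpy()  # the method recommended by the pandas docs

print(type(df))         # a pandas DataFrame
print(type(as_values))  # a NumPy ndarray
print(np.array_equal(as_values, as_numpy))  # both hold the same numbers
```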

Let’s understand this line of code: “y = dataset.iloc[:, -1].values”

Here we want only the last column, so we don’t use a range in the second parameter. We simply specify “-1” to select the last column.
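Putting the two lines together, the sketch below splits a small data frame into the matrix of features and the dependent variable vector and prints their shapes. The data frame is a hypothetical stand-in with invented columns and values, not the real CSV file.

```python
import pandas as pd

# Hypothetical dataset: three feature columns and one dependent column.
dataset = pd.DataFrame({
    "State": ["MP", "UP", "MP"],
    "Age": [20, 25, 30],
    "Salary": [5000, 7000, 9000],
    "Purchased": ["Yes", "No", "Yes"],
})

X = dataset.iloc[:, :-1].values  # all rows, every column except the last
y = dataset.iloc[:, -1].values   # all rows, only the last column

print(X.shape)  # (3, 3): 3 rows and 3 feature columns
print(y.shape)  # (3,): one value per row
print(X)
print(y)
```

A quick shape check like this is a cheap way to confirm that no feature column ended up in y and that the dependent column did not end up in X.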

Now we are all set to run our first example in machine learning. Below is the output. Most important: make sure that you run all the blocks one by one. Click the run button on the library import block first, then on the second block, and then on the last one. Otherwise you will receive an error.

You can also plot using the matplotlib.pyplot library. I have specified the code below. It generates a graph based on the records in your dataset.
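As one possible version of such a plot, here is a minimal sketch that draws a scatter plot of two numeric columns. The column names and values are assumptions made for illustration, so swap in the columns from your own dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical numeric data, invented for this example.
dataset = pd.DataFrame({
    "Age": [20, 25, 30, 35],
    "Salary": [5000, 7000, 9000, 12000],
})

fig, ax = plt.subplots()
ax.scatter(dataset["Age"], dataset["Salary"])  # one point per record
ax.set_xlabel("Age")
ax.set_ylabel("Salary")
ax.set_title("Salary by age (illustrative data)")

plt.show()  # in Google Colab the chart appears under the code cell
```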
