Machine Learning: Feature Scaling
Feature Scaling: when this term is used in machine learning, two things come into the picture.
1) Normalization (some data scientists also use the term Min-Max Scaling).
2) Standardization: (x - mean(x)) / standard deviation(x). Here we subtract each value of the feature by the mean of all values of that feature and then divide by the standard deviation. This places most of the feature's values roughly between -3 and +3.
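To see the standardization formula in action, here is a minimal sketch with a made-up feature column (the numbers are purely for illustration):

```python
import numpy as np

# Hypothetical feature column (values made up for illustration)
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Standardization: subtract the mean, divide by the standard deviation
z = (x - x.mean()) / x.std()

print(z)         # values are now centered around 0
print(z.mean())  # ~0
print(z.std())   # ~1
```

After standardization the feature has mean 0 and standard deviation 1, which is exactly what StandardScaler (shown later in this post) does for every feature at once.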
In our previous blog post, we have already learned why we need to perform feature scaling after splitting the dataset. So, I urge you to read it before going further; to read it, please click here. It will give you the idea of how to apply feature scaling to our model.
Feature scaling is like "Multigrain Roti": you have to use and give importance to every ingredient. Feature scaling can decide whether our machine learning model performs well or not. Now the question arises: do we need to apply feature scaling to every model? The answer is No.
"Normalization". So, when do we use normalization, and how does it work?
Before diving deep into the theory, first take a look at the equation: (x - min(x)) / (max(x) - min(x)). In normalization, we subtract each value of the feature by the minimum value of the feature and then divide by the difference between the maximum and minimum values of the feature.
Normalization is typically used when most of the features do not follow a normal distribution (or when the distribution is unknown), because as you can see in the normalization equation above, whenever we perform the calculation on a feature, we always get values between 0 and 1.
So, another question arises in your mind: can we change the range instead of 0 and 1? Well, you know we use Google Colab to perform our operations, and in most of our blog posts we have used the Scikit-Learn library. The library has a transformer called MinMaxScaler, and it has a hyperparameter "feature_range" that lets you change the range if you do not want 0-1 for some reason.
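A short sketch of MinMaxScaler with a custom feature_range (the data is a hypothetical single-column example, chosen only to make the scaling easy to follow):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature column for illustration
X = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Default range is (0, 1); feature_range lets you pick another, e.g. (-1, 1)
scaler = MinMaxScaler(feature_range=(-1, 1))
X_scaled = scaler.fit_transform(X)

print(X_scaled.ravel())  # values: -1, -0.5, 0, 0.5, 1
```

The minimum (10) maps to -1, the maximum (50) maps to +1, and everything in between is scaled linearly.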
Ok, so we have basic knowledge about feature scaling and the techniques we have to perform it. The next questions that arise in our mind: how do we apply feature scaling, and which technique are we supposed to use?
So, Normalization is recommended when our features do not follow a normal distribution, while Standardization works well most of the time and is much less affected by outliers. Therefore, here I am demonstrating Standardization.
from sklearn.preprocessing import StandardScaler
Fit method: it only computes the mean and standard deviation of all the values of each feature. Then we have transform, which applies the above formula to indeed transform your values so that they are all on the same scale.
StandardScaler: StandardScaler will distribute the data so that the mean is zero and the standard deviation is one.
Here we also apply the transform to the test set, but this data is new, like the data we will see in production later. So, we only apply the transform method. If we applied sc.fit_transform to the test set, we would compute a new scaler from the test data, which is not necessary and makes no sense at all: the test set must be scaled with the mean and standard deviation learned from the training set.
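The fit-on-train, transform-on-test pattern described above can be sketched like this (the train/test arrays are hypothetical numbers, chosen only for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical train/test split (values made up for illustration)
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test = np.array([[2.5], [5.0]])

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # learn mean/std from the training data
X_test_scaled = sc.transform(X_test)        # reuse the same mean/std; no refitting

print(sc.mean_)  # mean learned from the training set only: [2.5]
```

Notice that the test set is scaled with the training set's statistics, so a test value equal to the training mean (2.5) maps to exactly 0.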
Alright, everything is on the same scale.