What is the activation function in a neural network?
Deep learning is a subfield of machine learning. Neural networks are the most common category of deep learning algorithms.
What is a Neural Network?
In the previous blog post, we learned how to fit data to a model and get an output, such as whether a particular user will buy the product or not.
The same task can be done with a neural network, that is, with deep learning.
Each neuron reads the weighted sum of the outputs of all the neurons in the previous layer, applies an activation function, and passes the result to all the neurons in the subsequent layer.
What is the activation function in deep learning?
W1X1 +W2X2 + W3X3 + …
A neuron computes the weighted sum of all its inputs, as in the expression above, and applies an activation function to get the output, which is passed to the neurons in the next layer.
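The weighted sum can be sketched in a few lines of Python. This is only an illustration: the function names, the example weights, and the optional bias term are assumptions made for the sketch and do not come from the post.

```python
def weighted_sum(inputs, weights, bias=0.0):
    """Return W1*X1 + W2*X2 + ... (+ an optional bias, assumed here)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def neuron_output(inputs, weights, bias, activation):
    """Apply an activation function to the neuron's weighted sum."""
    return activation(weighted_sum(inputs, weights, bias))


# Hypothetical example: two inputs with made-up weights.
z = weighted_sum([2.0, 3.0], [0.5, -1.0], bias=1.0)
print(z)  # 0.5*2.0 + (-1.0)*3.0 + 1.0 = -1.0
```

Any activation function can be passed in as the last argument of `neuron_output`, which is why the weighted sum and the activation are kept as separate steps.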
What is the ReLU (Rectified Linear Unit) activation function?
In the hidden layers, we use the ReLU (Rectified Linear Unit) activation function, also called a rectifier. It outputs zero when its input is negative and returns the input unchanged when it is positive.
As the highlighted line in the image shows, for some variables the neuron ignores the input up to a certain point.
Imagine that a user is unlikely to buy the product up to age 18.
So the neuron gives no output up to age 18. Once the age crosses 18, the chance of buying the product increases.
As the customer's age increases beyond that point, the chance of buying the product goes up linearly.
That is how the ReLU activation function behaves when it is applied in the hidden layers of a neural network (neural net).
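The age example can be written as a small Python sketch. The `purchase_signal` helper, its weight of 1.0, and the threshold of 18 are hypothetical values chosen only to mirror the example; in a real network the weight and the bias that set the threshold are learned.

```python
def relu(z):
    """Rectified Linear Unit: 0 for negative inputs, z itself otherwise."""
    return max(0.0, z)


def purchase_signal(age, weight=1.0, threshold=18.0):
    """Hypothetical hidden neuron: silent up to the threshold, linear after it.

    weight * (age - threshold) equals weight * age + bias
    with bias = -weight * threshold.
    """
    return relu(weight * (age - threshold))


print(purchase_signal(16))  # 0.0  (below 18, the neuron gives no output)
print(purchase_signal(18))  # 0.0
print(purchase_signal(30))  # 12.0 (grows linearly once age passes 18)
```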
What is the SoftMax activation function with Python?
It is also known as softargmax or the normalized exponential function.
SoftMax is the function applied at the output layer of a classification model. It gives the probability of each outcome, here whether the user will buy or not.
In this example, the inputs are weighted first (the buyer's age multiplied by a weight, the salary multiplied by a weight), and SoftMax turns the resulting score for each class into a probability, so that the probabilities add up to 1.
After SoftMax is applied, the class with the highest probability is the predicted class, for example whether the buyer is going to buy or not.
It can be applied to more than two classes also.
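A minimal Python sketch of SoftMax follows. The scores passed in are made up for illustration, and subtracting the largest score before exponentiating is a common numerical-stability trick that is assumed here rather than taken from the post.

```python
import math


def softmax(scores):
    """Turn a list of raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the two classes "buys" and "does not buy".
probs = softmax([2.0, 0.5])
predicted_class = probs.index(max(probs))
print(probs)            # roughly [0.82, 0.18]
print(predicted_class)  # 0, the class with the highest probability
```

The same function works unchanged for three or more classes, since it only needs one score per class.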
CROSS-ENTROPY Loss Calculation
SoftMax is used together with the cross-entropy loss calculation. Cross-entropy measures how different two distributions are: the predicted distribution and the actual distribution.
If its value is low, the actual and the predicted values are in agreement.
The classifier tries to minimize the difference between the predicted value and the actual value by applying the SoftMax activation function and the cross-entropy loss calculation.
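As a rough sketch, cross-entropy for a single example can be computed as follows. The one-hot encoding of the actual class and the small `eps` guard against `log(0)` are assumptions made for this illustration.

```python
import math


def cross_entropy(actual, predicted, eps=1e-12):
    """Cross-entropy between an actual and a predicted distribution.

    `actual` is assumed to be one-hot (1.0 for the true class, 0.0 elsewhere);
    `predicted` is a list of probabilities, e.g. the output of SoftMax.
    """
    return -sum(a * math.log(max(p, eps)) for a, p in zip(actual, predicted))


actual = [1.0, 0.0]  # the user really did buy
good = cross_entropy(actual, [0.9, 0.1])
poor = cross_entropy(actual, [0.6, 0.4])
print(good)  # about 0.105: prediction and reality are close
print(poor)  # about 0.511: a less confident prediction gives a higher loss
```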
In the neural network, the input layer receives the input data. In every neuron of the next layer, a weighted sum is calculated and an activation function is applied, and the output is passed on to the following layer, where a weighted sum is calculated again.
An activation function is applied again, the output is passed to the next layer, and so on.
In the end, we get the output.
It could be a probability for a classification model, or the actual value for a regression model.
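The layer-by-layer flow described above can be sketched as a tiny forward pass with one ReLU hidden layer and a SoftMax output layer. The layer sizes and all weight values below are made up for illustration; they are not the network from the post.

```python
import math


def relu(z):
    return max(0.0, z)


def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sum for every neuron in a layer (one weight row per neuron)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input -> hidden layer (ReLU) -> output layer (SoftMax)."""
    hidden = [relu(z) for z in dense(inputs, hidden_w, hidden_b)]
    return softmax(dense(hidden, out_w, out_b))


# Hypothetical weights for 2 inputs, 2 hidden neurons, and 2 output classes.
probs = forward(
    inputs=[1.0, 2.0],
    hidden_w=[[1.0, 0.0], [0.0, -1.0]], hidden_b=[0.0, 0.0],
    out_w=[[1.0, 1.0], [0.0, 0.0]], out_b=[0.0, 0.0],
)
print(probs)  # two class probabilities that sum to 1
```

For a regression model, the final SoftMax would be dropped so that the last layer returns the raw value instead of probabilities.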
This process continues until the loss is minimized. One complete pass of the training data through the network, from the input layer to the output layer, is called one epoch. We can run multiple epochs until we reach the desired accuracy.
For the classification model, we minimize the cross-entropy loss to adjust the weights.
In the image, the arrow shows the feedback loop in the neural network. The loss is propagated backward from the output layer toward the input layer so that the weights can be adjusted to minimize the loss.
This is how a deep learning artificial neural network learns: by adjusting its weights.
With more epochs, the accuracy on the training data generally goes up and the loss goes down, because with each epoch the neurons learn something more about the data and keep adjusting their weights accordingly.
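To make the idea of epochs concrete, here is a minimal training loop. It is a simplified sketch under several assumptions: it trains a single SoftMax layer (no hidden layers) with plain stochastic gradient descent, and the toy data, learning rate, and epoch count are invented for the example. It relies on the standard result that, for SoftMax combined with cross-entropy, the gradient of the loss with respect to each class score is the predicted probability minus the actual value.

```python
import math


def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scores_for(weights, biases, x):
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def predict(weights, biases, x):
    probs = softmax(scores_for(weights, biases, x))
    return probs.index(max(probs))


def train(samples, labels, n_classes, epochs=100, lr=0.5):
    n_features = len(samples[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    biases = [0.0] * n_classes
    history = []  # average cross-entropy loss per epoch
    for _ in range(epochs):  # one epoch = one pass over all samples
        total_loss = 0.0
        for x, y in zip(samples, labels):
            probs = softmax(scores_for(weights, biases, x))
            total_loss += -math.log(max(probs[y], 1e-12))
            for k in range(n_classes):
                # Gradient of the loss w.r.t. score k: predicted - actual.
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(n_features):
                    weights[k][j] -= lr * grad * x[j]
                biases[k] -= lr * grad
        history.append(total_loss / len(samples))
    return weights, biases, history


# Hypothetical toy data: one rescaled "age" feature, label 1 = buys.
samples = [[-1.0], [-0.8], [0.5], [1.5]]
labels = [0, 0, 1, 1]
weights, biases, history = train(samples, labels, n_classes=2)
print(history[0], history[-1])  # the loss after the last epoch is lower
```

A multi-layer network follows the same loop, except that the gradient is also passed backward through each hidden layer before the weights are updated.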
Several deep learning libraries are available to construct the neural network. In the upcoming blog post, I am going to give you an idea of each of them.