An activation function is applied to the weighted sum of a neuron's inputs. Its role is to introduce non-linearity into the decision boundary of the Neural Network. In this chapter of the PyTorch tutorial, you will learn about the activation functions available in the PyTorch library and how to use them in your Neural Network.
The activation functions can be applied to a layer of the network in the following ways:
- By using the activation function layer: these are classes that are added to the model like any other layer.
- By using the activation function definition: these are functions that are called directly on a tensor.
We will look at both ways in this chapter, one by one.
Activation Function Layer
When using the activation function layer, we add the layer to the architecture of our model just like any other layer. The activation function layer is therefore created inside the constructor of the Neural Network class, or added to the model through the torch.nn.Sequential API. A minimal sketch of both styles is shown below.
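For instance, a minimal sketch of both styles might look like the following (the layer sizes and model name here are illustrative, not part of the tutorial):
import torch
from torch import nn

# Activation function layer stored in the constructor of a custom model
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)   # illustrative layer sizes
        self.relu = nn.ReLU()        # activation function layer
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        x = self.relu(self.fc1(x))   # apply the stored activation layer
        return self.fc2(x)

# The same activation layer used with the torch.nn.Sequential API
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)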
Importing the Activation Function Layer
The activation function layers are present in the torch.nn module.
from torch import nn
Using the Activation Function Layer
You need to create an instance of the activation function layer that you want to use. Next, you provide input to the layer as you would to any other layer. The layer applies the activation function and returns the output.
Example
In this example, we will create a layer by creating an instance of the ReLU class. We will provide a tensor to the layer as input and see its output.
import torch
from torch import nn

# Create a tensor
input = torch.tensor([-2, -1, 0, 1, 2])
# Create an instance of the ReLU activation layer
relu_layer1 = nn.ReLU()
# Feed the input tensor to the ReLU layer and store the output
output = relu_layer1(input)
print(output)
# Outputs- tensor([0, 0, 0, 1, 2])
Activation Function Definitions
PyTorch has implemented many activation functions that can be used directly in your model by calling them as functions. These functions are used inside the forward() method when defining the flow of data through the model, as shown in the sketch below.
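A minimal sketch of this approach might look like the following (the layer sizes and model name here are illustrative):
import torch
from torch import nn
from torch.nn import functional as F

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)   # illustrative layer sizes
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))      # activation applied as a function call
        return self.fc2(x)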
Importing the Activation Function Definition
The activation function definitions are present in the torch.nn.functional module. This module is commonly imported as F.
from torch.nn import functional as F
Using the Activation Function Definition
You need to provide the input to the function as you would to any other function. The function applies the activation and returns the output.
Example
In this example, we will apply the relu() activation function to an input and see its output.
import torch
from torch.nn import functional as F

# Create a tensor
input = torch.tensor([-2, -1, 0, 1, 2])
# Feed the input tensor to the relu function and store the output
output = F.relu(input)
print(output)
# Outputs- tensor([0, 0, 0, 1, 2])
Activation Functions in PyTorch
The PyTorch library contains most of the commonly used activation functions. These activation functions are available both as layers (classes) and as definitions (functions). The table below lists some of the most commonly used activation functions along with their respective class and function names.
Activation Function Layer Class (in torch.nn) | Activation Function Definition (in torch.nn.functional) | Brief Description |
---|---|---|
Sigmoid | sigmoid() | Computes the sigmoid of the input. Sigmoid function is defined as- σ(x) = 1/(1+exp(-x)) |
Tanh | tanh() | Computes the hyperbolic tangent of the input. Hyperbolic tangent function is defined as- tanh(x) = 2σ(2x)-1 |
ReLU | relu() | Rectified Linear Unit. ReLU function is defined as- relu(x) = max(0,x) |
LeakyReLU | leaky_relu() | Leaky Rectified Linear Unit. Leaky ReLU function is defined as- leaky_relu(x) = max(αx, x), where α is a hyper-parameter typically set to 0.01. |
ELU | elu() | Exponential Linear Unit. ELU function is defined as- elu(x) = α(exp(x) - 1) if x < 0, and elu(x) = x if x ≥ 0. |
Softmax | softmax() | Softmax is used as the final layer for multi-class classification problems. It is used to predict the probability of each output class. |
Softplus | softplus() | Softplus is used as a smooth approximation to the ReLU activation function. |
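As a rough sketch of how some of these map between the two styles (the input tensor below is illustrative), note that some of them take extra arguments, for example softmax() requires a dim and leaky_relu() accepts a negative_slope:
import torch
from torch import nn
from torch.nn import functional as F

x = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])   # illustrative input

# Layer classes from torch.nn
print(nn.Sigmoid()(x))
print(nn.Tanh()(x))
print(nn.LeakyReLU(negative_slope=0.01)(x))
print(nn.Softmax(dim=0)(x))

# Equivalent function definitions from torch.nn.functional
print(F.sigmoid(x))
print(F.tanh(x))
print(F.leaky_relu(x, negative_slope=0.01))
print(F.softmax(x, dim=0))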