An optimizer is used to iteratively update the parameters of a model. Optimizers work together with a loss function to adjust the parameters of a neural network during training. In this chapter of the PyTorch tutorial, you will learn about the optimizers available in the PyTorch library and how to use them.
Importing Optimizer
In the PyTorch library, the optimizers are located in the torch.optim module.
from torch import optim
You need to create an instance of the optimizer that you want to use.
Example
In this example, we will create an instance of the SGD class. The SGD class performs Stochastic Gradient Descent and is used to optimize the parameters of the model.
While creating an instance of the optimizer, you need to provide the model's parameters and the learning rate as arguments. Depending on the optimizer, you might also need to provide other hyper-parameters. For a Stochastic Gradient Descent optimizer, only the parameters and the learning rate are required.
optimizer = optim.SGD(model.parameters(), lr=0.001)
The torch.optim module contains most of the commonly used optimizers. Some of the optimizers available in PyTorch are listed below, along with a short example of how they can be created.
Optimizer Class | Description
---|---
SGD | Stochastic Gradient Descent. The Momentum optimizer can be used by passing the momentum argument to SGD, and the Nesterov Accelerated Gradient (NAG) optimizer by additionally passing nesterov=True.
Adagrad | Adaptive Gradient optimizer.
RMSprop | Root Mean Square Propagation optimizer, commonly known as RMSprop.
Adam | Adaptive Moment Estimation optimizer, commonly known as Adam.
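As a rough illustration, here is how a few of these optimizers might be created. The learning rate and momentum values below are only placeholder examples, not recommended settings, and model is assumed to be an existing neural network.

# SGD with Momentum (momentum value is an illustrative placeholder)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# SGD with Nesterov Accelerated Gradient (requires a non-zero momentum)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)
# Adagrad
optimizer = optim.Adagrad(model.parameters(), lr=0.01)
# RMSprop
optimizer = optim.RMSprop(model.parameters(), lr=0.001)
# Adam
optimizer = optim.Adam(model.parameters(), lr=0.001)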
zero_grad()
In PyTorch, by default, the gradients calculated during the backward pass of the training loop keep accumulating. Since this is usually not desirable, the gradients need to be reset to zero in each iteration of the training loop. This is where the zero_grad() method comes in handy: it clears all the gradients. The zero_grad() method is implemented by all the optimizers in the PyTorch library.
# Setting gradients to 0
optimizer.zero_grad()
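To see the accumulation behaviour in isolation, here is a minimal sketch with a single hypothetical trainable parameter; the values in the comments show what the two backward passes would produce.

import torch
from torch import optim

# Hypothetical single trainable parameter, used only to illustrate accumulation
w = torch.tensor(1.0, requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)

(2 * w).backward()
print(w.grad)    # tensor(2.)
(2 * w).backward()
print(w.grad)    # tensor(4.) - the gradients from the two backward passes have accumulated

optimizer.zero_grad()    # clears the accumulated gradients before the next backward pass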
step()
Once the gradients are calculated for all the parameters, the model parameters need to be adjusted. This can be done by calling the step() method. The step() method is implemented by all the optimizers in the PyTorch library.
# Adjusting the parameters of the model
optimizer.step()
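Putting zero_grad() and step() together, a single training iteration typically looks like the rough sketch below; model, loss_fn, and dataloader are hypothetical placeholders for your own model, loss function, and data loader.

# Rough sketch of a training loop (model, loss_fn and dataloader are placeholders)
for inputs, targets in dataloader:
    optimizer.zero_grad()             # clear gradients from the previous iteration
    outputs = model(inputs)           # forward pass
    loss = loss_fn(outputs, targets)  # compute the loss
    loss.backward()                   # backward pass: compute gradients
    optimizer.step()                  # update the model parameters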