PyTorch Optimizers

An optimizer is used to iteratively update the parameters of a model. Optimizers are used together with a loss function to adjust the parameters of a neural network during training. In this chapter of the PyTorch tutorial, you will learn about the optimizers available in the PyTorch library and how to use them.

Importing Optimizers

In the PyTorch library, the optimizers are located in the torch.optim module.

from torch import optim

You need to create an instance of the optimizer that you want to use.

Example

In this example, we will create an instance of the SGD class. The SGD class implements Stochastic Gradient Descent, which is used to optimize the parameters of the model.

While creating an instance of an optimizer, you need to provide the parameters of the model and the learning rate as arguments. Depending on the optimizer, you may also need to pass other hyperparameters. For the Stochastic Gradient Descent optimizer, only the model parameters and the learning rate are required.

optimizer = optim.SGD(model.parameters(), lr=0.001)

The torch.optim module contains most of the commonly used optimizers. Some of the optimizers available in PyTorch are:

SGD: Stochastic Gradient Descent. The Momentum optimizer can be used by passing the momentum argument to SGD, and the Nesterov Accelerated Gradient (NAG) optimizer can be used by additionally passing nesterov=True.
Adagrad: Adaptive Gradient optimizer.
RMSprop: Root Mean Square Propagation optimizer, commonly written as RMSProp.
Adam: Adaptive Moment Estimation optimizer, commonly known as Adam.
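As a quick illustration, the snippet below creates one instance of each optimizer listed above. It is only a sketch: model is assumed to be a torch.nn.Module you have already defined, and the learning rates shown are illustrative values, not recommendations.

from torch import optim

# "model" is assumed to be an existing torch.nn.Module instance
# SGD with momentum; nesterov=True enables Nesterov Accelerated Gradient
optimizer_sgd = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)

optimizer_adagrad = optim.Adagrad(model.parameters(), lr=0.01)
optimizer_rmsprop = optim.RMSprop(model.parameters(), lr=0.001)
optimizer_adam = optim.Adam(model.parameters(), lr=0.001)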

zero_grad()

In PyTorch, the gradients computed during the backward pass accumulate by default: each call to backward() adds to the gradients already stored in the parameters. Since this is usually not desirable during training, the gradients must be reset to zero in each iteration of the training loop. This is what the zero_grad() method does: it sets all the gradients of the parameters managed by the optimizer to zero. The zero_grad() method is implemented by all optimizers in the PyTorch library.

# Setting gradients to 0
optimizer.zero_grad()
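
To see why this is needed, the small sketch below (using a standalone tensor rather than a full model) shows gradients accumulating across two backward passes when they are not zeroed in between:

import torch

w = torch.ones(2, requires_grad=True)

(w * 3).sum().backward()
print(w.grad)   # tensor([3., 3.])

# calling backward() again without zeroing adds to the existing gradients
(w * 3).sum().backward()
print(w.grad)   # tensor([6., 6.])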

step()

Once the gradients have been calculated for all the parameters, the model parameters need to be updated. This is done by calling the step() method, which adjusts each parameter according to the optimizer's update rule. The step() method is implemented by all optimizers in the PyTorch library.

# Adjusting the parameters of the model
optimizer.step()
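
Putting the pieces together, a typical training iteration calls zero_grad(), computes the loss, calls backward(), and then calls step(). The sketch below assumes a small linear model, a mean squared error loss, and random input/target tensors, all of which are only placeholders for illustration:

import torch
from torch import nn, optim

# illustrative model, data, and loss function; replace with your own
model = nn.Linear(10, 1)
inputs = torch.randn(64, 10)
targets = torch.randn(64, 1)
loss_fn = nn.MSELoss()

optimizer = optim.SGD(model.parameters(), lr=0.001)

for epoch in range(5):
    optimizer.zero_grad()             # reset gradients from the previous iteration
    outputs = model(inputs)           # forward pass
    loss = loss_fn(outputs, targets)  # compute the loss
    loss.backward()                   # backward pass: compute gradients
    optimizer.step()                  # update the model parameters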