Training a Model in PyTorch

You have learned about the different components used to train a model in PyTorch. In this chapter of the PyTorch Tutorial, you will learn how to use these components together to train a model.

A PyTorch Training Loop

In this example, we will look at what a basic training loop looks like in PyTorch.

for epoch in range(epochs):
    for batch in train_dataloader:
        optimizer.zero_grad()
        input, label = batch
        output = mynet(input)
        loss = loss_function(output, label)
        loss.backward()
        optimizer.step()

Code Explanation

The code above can look overwhelming at first, so let’s go through it line by line.

for epoch in range(epochs) – Runs the training loop over the whole dataset the given number of times (epochs).
for batch in train_dataloader – Runs the inner loop once for every batch of the dataset.
optimizer.zero_grad() – Resets the gradients of all parameters managed by the optimizer, so that gradients from the previous batch do not accumulate.
input, label = batch – Unpacks the tuple batch, extracting the inputs as input and the labels as label.
output = mynet(input) – Feeds the inputs to mynet and computes the predicted output.
loss = loss_function(output, label) – Calculates the loss for that batch by comparing the output with the label using loss_function.
loss.backward() – Performs backpropagation, calculating the gradient of the loss with respect to each parameter of mynet.
optimizer.step() – Adjusts the model’s parameters using the calculated gradients.
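To see the steps above run end to end, here is a minimal sketch. The toy data, the single linear layer used for mynet, the choice of nn.CrossEntropyLoss and SGD, and all sizes and hyperparameters are assumptions made for illustration only; they are not part of the tutorial's own setup.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # make the sketch reproducible

# Hypothetical toy data: 64 samples, 4 features, 3 classes.
# The label is the index of the largest of the first three features,
# so there is a pattern for the model to learn.
features = torch.randn(64, 4)
labels = features[:, :3].argmax(dim=1)
train_dataloader = DataLoader(TensorDataset(features, labels), batch_size=16)

# Hypothetical stand-ins for the objects used in the loop
mynet = nn.Linear(4, 3)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(mynet.parameters(), lr=0.1)
epochs = 5

epoch_losses = []
for epoch in range(epochs):
    running_loss = 0.0
    for batch in train_dataloader:
        optimizer.zero_grad()                # reset gradients from the previous batch
        input, label = batch                 # unpack inputs and labels
        output = mynet(input)                # forward pass
        loss = loss_function(output, label)  # loss for this batch
        loss.backward()                      # backpropagation
        optimizer.step()                     # parameter update
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(train_dataloader))

print(epoch_losses)
```

If the loop is working, the average loss recorded for the last epoch should be lower than the one recorded for the first epoch.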


Adding Validation to the Training Loop

Now that you know how to run a basic training loop in PyTorch, it’s time to learn how to evaluate the model on a validation dataset.

The following example shows how you can perform validation while training your neural network.

for epoch in range(epochs):

    # Performing Training for each epoch
    training_loss = 0.
    mynet.train()

    # The training loop
    for batch in train_dataloader:
        optimizer.zero_grad()
        input, label = batch
        output = mynet(input)
        loss = loss_function(output, label)
        loss.backward()
        optimizer.step()
        training_loss += loss.item()


    # Performing Validation for each epoch
    validation_loss = 0.
    mynet.eval()

    # The validation loop (no parameter updates here, so gradient tracking is disabled)
    with torch.no_grad():
        for batch in validation_dataloader:
            input, label = batch
            output = mynet(input)
            loss = loss_function(output, label)
            validation_loss += loss.item()

    # Calculating the average training and validation loss over epoch
    training_loss_avg = training_loss/len(train_dataloader)
    validation_loss_avg = validation_loss/len(validation_dataloader)

    # Printing average training and average validation losses
    print("Epoch: {}".format(epoch))
    print("Training loss: {}".format(training_loss_avg))
    print("Validation loss: {}".format(validation_loss_avg))

The code in this example is mostly similar to the code in the previous example, so most of it should already be familiar. However, a few parts need explanation.

Calling the train() and eval() methods on the model puts it into training mode and evaluation/testing mode, respectively. This matters because some layers of a neural network, such as Batch Normalization and Dropout, behave differently during training than during testing. If your network uses no such layers, calling train() and eval() is not strictly necessary, although doing so is harmless and keeps the code correct if such layers are added later.

train() – Sets the model (mynet in these examples) into training mode. Call it on the model before training.
eval() – Sets the model (mynet in these examples) into evaluation mode. Call it on the model before validating or testing.
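The difference between the two modes can be observed directly on a Dropout layer. The following is a minimal sketch; the one-layer mynet, the dropout probability of 0.5 and the input of ones are assumptions chosen only to make the effect easy to see.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical model consisting of a single Dropout layer
mynet = nn.Sequential(nn.Dropout(p=0.5))
x = torch.ones(8)

mynet.train()           # training mode: elements are randomly zeroed,
train_out = mynet(x)    # and the survivors are scaled by 1 / (1 - p) = 2

mynet.eval()            # evaluation mode: Dropout passes inputs through unchanged
eval_out = mynet(x)

print(mynet.training)   # False after calling eval()
print(train_out)        # a mix of 0.0 and 2.0 values
print(eval_out)         # identical to x
```

Forgetting to call eval() before validation would leave Dropout active, so the validation loss would be measured on randomly perturbed activations.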

Another part that requires some explanation is the calculation of the training and validation losses. Both are calculated in the same way: we initialize them to 0 and keep adding the loss of every batch of the training or validation dataset. Therefore, training_loss is the total training loss over an epoch and validation_loss is the total validation loss over an epoch. Finally, the average training loss, training_loss_avg, is calculated by dividing training_loss by the length of train_dataloader. Similarly, the average validation loss, validation_loss_avg, is calculated by dividing validation_loss by the length of validation_dataloader.
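The averaging described above treats every batch equally. As a small worked example, assuming the loss function returns the mean loss over each batch (the default for many PyTorch loss functions, such as nn.CrossEntropyLoss), the per-batch average can differ slightly from a sample-weighted average when the last batch is smaller. The loss values and batch sizes below are made up for illustration.

```python
# Hypothetical per-batch mean losses and batch sizes for one epoch
batch_losses = [0.9, 0.7, 0.5]
batch_sizes = [4, 4, 2]   # the last batch is smaller

# Average as described above: total loss / number of batches
avg_per_batch = sum(batch_losses) / len(batch_losses)

# Sample-weighted alternative: weight each batch mean by its batch size
weighted_total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
avg_per_sample = weighted_total / sum(batch_sizes)

print(avg_per_batch)    # approximately 0.70
print(avg_per_sample)   # approximately 0.74
```

When all batches have the same size, the two averages coincide, so the simpler per-batch average is usually good enough for monitoring training.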

Note: The length of a data loader such as train_dataloader or validation_dataloader is the number of batches it returns. So, if a dataset contains 1000 samples and the data loader feeds batches of 10 samples to the neural network, the length of the data loader will be 100, which is the number of batches it returns.
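The numbers in the note can be checked with a short sketch. The dummy dataset of zeros is an assumption used only to get 1000 samples; the second loader shows what happens when the batch size does not divide the dataset size evenly.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset: 1000 dummy samples with 3 features each
dataset = TensorDataset(torch.zeros(1000, 3))

loader = DataLoader(dataset, batch_size=10)
print(len(dataset))   # 1000 samples
print(len(loader))    # 100 batches

# With a batch size of 300, the last batch holds only the remaining 100 samples
uneven_loader = DataLoader(dataset, batch_size=300)
print(len(uneven_loader))   # 4 batches (300 + 300 + 300 + 100)
```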


Making Predictions

Once you have validated your model and it is sufficiently accurate, you might want to use the model to make predictions. This is simple in PyTorch.

Pass the inputs to your PyTorch model, which returns one score per class for each input (these are probabilities if the model ends with a softmax layer). To pick the label with the highest score, use PyTorch’s argmax() method, which returns the index of the most probable class.

# Make Predictions
prediction_proba = mynet(input)

# Convert from probabilities to class labels
# (dim=1 picks the best class for each sample, assuming input is a batch)
prediction = prediction_proba.argmax(dim=1)

# Print predictions
print(prediction)
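The sketch below shows what argmax() does on a batch of model outputs. The scores are made-up values standing in for the output of mynet for 3 samples and 4 classes. It also shows that applying softmax to turn scores into probabilities does not change which class is selected.

```python
import torch

# Hypothetical raw scores for a batch of 3 samples and 4 classes
scores = torch.tensor([[0.1, 2.0, 0.3, 0.5],
                       [1.5, 0.2, 0.1, 0.4],
                       [0.0, 0.1, 0.2, 3.0]])

# Softmax turns each row of scores into probabilities that sum to 1
probabilities = torch.softmax(scores, dim=1)

# argmax along dim=1 returns one class index per sample
prediction = probabilities.argmax(dim=1)

print(prediction)   # tensor([1, 0, 3])
```

Calling argmax() without a dim argument would instead return a single index into the flattened tensor, which is rarely what is wanted for a batch.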

Alternatively, you can use a test loop to measure the model’s accuracy over a test dataset.

# Set the number of correct predictions to 0
num_correct_pred = 0

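# Test loop (evaluation mode, with gradient tracking disabled)
mynet.eval()
with torch.no_grad():
    for batch in test_dataloader:
        input, label = batch
        output = mynet(input)
        # Index of the highest score along the class dimension
        _, predictions = torch.max(output, 1)
        num_correct_pred += (predictions == label).sum().item()

# Calculating Accuracy (divide by the number of samples, which stays
# correct even when the last batch is smaller than batch_size)
accuracy = num_correct_pred/len(test_dataloader.dataset)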

# Print Accuracy
print("Accuracy: {}".format(accuracy))
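The choice of denominator matters when the dataset size is not a multiple of the batch size. The following sketch uses a made-up test set of 10 samples with a batch size of 4, and nn.Identity() as a hypothetical stand-in for a trained mynet (the inputs are already class scores), so the expected number of correct predictions is known in advance.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical test set: 10 samples, 3 classes
labels = torch.tensor([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
scores = nn.functional.one_hot(labels, num_classes=3).float()
# Shift the scores of the first 3 samples so that their predictions are wrong
scores[:3] = scores[:3].roll(1, dims=1)

test_dataloader = DataLoader(TensorDataset(scores, labels), batch_size=4)
mynet = nn.Identity()   # stand-in model: returns its input unchanged

mynet.eval()
num_correct_pred = 0
with torch.no_grad():
    for input, label in test_dataloader:
        predictions = mynet(input).argmax(dim=1)
        num_correct_pred += (predictions == label).sum().item()

# Dividing by the number of samples gives the true accuracy
accuracy = num_correct_pred / len(test_dataloader.dataset)

# Dividing by batches * batch_size counts 12 samples instead of 10
padded = num_correct_pred / (len(test_dataloader) * test_dataloader.batch_size)

print(num_correct_pred)   # 7
print(accuracy)           # 0.7
print(padded)             # about 0.583, an underestimate
```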