What is Imitation Learning?

Imitation Learning is not a recent advancement in Machine Learning; it has been around for quite a while. One of its first uses dates back to 1989, when it was used to train ALVINN, the neural network that steered one of the first self-driving cars in the world. Despite this long history, Imitation Learning is not discussed much. It is sometimes considered a form of Reinforcement Learning, but there are some well-established differences between the two. That’s a topic for another blog post. In this post, I will be talking exclusively about Imitation Learning.

Imitations. Image by Tara Winstead

Imitation Learning

Imitation Learning is a form of Supervised Machine Learning in which the aim is to train the agent by demonstrating the desired behavior. Let’s break down that definition a bit. Imitation Learning has the following three components:

  1. The Environment – The environment can be a real place, but it is usually a simulation.
  2. The Agent – The agent is the one we want to train. It can be any Machine Learning model, but it is usually a Deep Neural Network. If you are familiar with Reinforcement Learning, the agent here plays the same role as the agent there.
  3. The Trainer – The trainer is the one who trains the agent. It is usually a human who demonstrates to the agent how to perform a task.

The trainer demonstrates the task in the environment during the training phase. Afterwards, the agent acts in the environment on its own, based on what it learned from those demonstrations.

Example

Suppose you want to train a model to race a car in a car racing game. In this case, the game is the environment. The agent will be the neural network that learns to play the game over time as the trainer teaches it. The trainer will be a human playing the game, from whom the model will try to learn.

During training, the human player (the trainer) will play the game, and the neural network (the agent) will learn from it. Once training is complete, the agent will be able to play the game on its own.
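To make the roles concrete, here is a minimal sketch of the demonstration phase. Everything in it is an assumption made for illustration: `ToyRacingEnv` is a made-up three-lane stand-in for the racing game, `trainer_policy` is a scripted stand-in for the human player, and `record_demonstrations` is a hypothetical helper. It only shows how the trainer's play can be logged for the agent to learn from later.

```python
import random


class ToyRacingEnv:
    """Hypothetical stand-in for the racing game: three lanes, random obstacles."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.lane = 1
        self.obstacle_ahead = False

    def reset(self):
        self.lane = 1
        self.obstacle_ahead = self.rng.random() < 0.5
        return self.observe()

    def observe(self):
        # What the agent can see at this moment: (lane, obstacle flag).
        return (self.lane, int(self.obstacle_ahead))

    def step(self, action):
        # Actions: -1 = steer left, 0 = keep lane, +1 = steer right.
        self.lane = min(2, max(0, self.lane + action))
        self.obstacle_ahead = self.rng.random() < 0.5
        return self.observe()


def trainer_policy(state):
    """Scripted stand-in for the human trainer: swerve when an obstacle is ahead."""
    lane, obstacle_ahead = state
    if not obstacle_ahead:
        return 0
    return 1 if lane < 2 else -1


def record_demonstrations(env, trainer, n_steps):
    """Let the trainer drive and log what it saw and what it did at every step."""
    demos = []
    state = env.reset()
    for _ in range(n_steps):
        action = trainer(state)
        demos.append((state, action))
        state = env.step(action)
    return demos


demos = record_demonstrations(ToyRacingEnv(seed=0), trainer_policy, n_steps=200)
print(len(demos), "demonstration steps recorded")
```

In a real setup the trainer would be a person at the controls and the observation would be something much richer, such as the game screen, but the loop has the same shape.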

The Training

Since Imitation Learning is a form of supervised learning, the agent is trained like any other supervised learning model. The agent (a neural network model) is given a set of input features (the independent variables) and is trained to predict a target variable (the dependent variable). During the training phase, when the trainer gives demonstrations, the agent receives both the input features and the target variable. The input features come from the environment; the target variable comes from the trainer, in the form of the trainer’s actions. The training data, therefore, consists of state-action pairs.
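As a rough illustration of training on state-action pairs, the sketch below fits a tiny model to a handful of demonstrations. To keep it short and dependency-free, it swaps the neural network for a single logistic unit trained with plain gradient descent. The data are made up, and the names `demonstrations`, `train_agent` and `agent_policy` are hypothetical.

```python
import math

# Made-up demonstrations. Each state is the distance to the obstacle ahead
# (0 = touching, 1 = far away); each action is what the trainer did
# (1 = steer away, 0 = keep going).
demonstrations = [
    (0.05, 1), (0.10, 1), (0.15, 1), (0.20, 1),
    (0.60, 0), (0.70, 0), (0.80, 0), (0.90, 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_agent(pairs, learning_rate=1.0, steps=5000):
    """Fit p(steer away | state) = sigmoid(w * state + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for state, action in pairs:
            # Prediction minus the trainer's action: the supervised error signal.
            error = sigmoid(w * state + b) - action
            grad_w += error * state / n
            grad_b += error / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def agent_policy(state, params):
    """The learned policy: map a state to an action."""
    w, b = params
    return 1 if sigmoid(w * state + b) >= 0.5 else 0


params = train_agent(demonstrations)
print([agent_policy(state, params) for state, _ in demonstrations])
```

The state is the input feature and the trainer's action is the label, so nothing here is specific to Imitation Learning beyond where the labels come from. A deep network would replace the single unit when the state is an image or another high-dimensional input.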

Example

Let’s return to the car racing game. During the training phase, the model gets its input features from the environment. These can be the view of what is in front of the car at a particular moment in the game, the speed of the car, the level of acceleration or braking, etc. The target variable is the action of the trainer at that moment. It is this behavior of the trainer that the agent has to learn. So if there is an obstruction in the way of the car, the trainer will change the path of the car. Based on this demonstration, the agent will learn that a particular input feature (an obstruction) leads to changing the path.

To put it more formally, during the training phase, the agent observes the state of the environment and the actions taken by the trainer, and forms a policy based on them.

A few terms used in the above definition need an explanation:

  1. State – The state describes the current situation of the environment.
  2. Policy – The policy is the strategy that the model has learned from the training data, and on which the agent bases all its future decisions.
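One common way to write these two terms down, under the supervised view taken in this post, is sketched below. The symbols are assumptions introduced here for illustration: s is a state, a is an action, D is the set of N state-action pairs recorded from the trainer, π_θ is the agent's policy with parameters θ, and ℓ is whatever supervised loss is chosen to compare the agent's action with the trainer's.

```latex
% s: state, a: action, \pi_\theta: the agent's policy with parameters \theta
% \mathcal{D}: demonstrations recorded from the trainer, \ell: a supervised loss
\mathcal{D} = \{(s_i, a_i)\}_{i=1}^{N}, \qquad
\pi_\theta : s \mapsto a, \qquad
\theta^{*} = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N}
    \ell\big(\pi_\theta(s_i),\, a_i\big)
```

Read in words: the policy is a function from states to actions, and training picks the parameters that make the policy's actions match the trainer's actions as closely as possible on the recorded states.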

I hope this blog post gave you a basic understanding of Imitation Learning. In future blog posts, I will talk about the types of Imitation Learning and how it differs from Reinforcement Learning.

References

Imitation Learning Tutorial by Yisong Yue & Hoang M. Le, ICML 2018.