FORMULA1
Competitive car racing via reinforcement learning
ABOUT
In this project, we train reinforcement learning agents to navigate a racetrack in a simple continuous control task. We first train and compare the performance of five single-agent algorithms, and then train a multi-agent algorithm so that agents navigate the course competitively. Check out the details of our project below!
METHODS
In the first phase of our project, we trained and compared several single-agent RL methods. In the second phase, we focused on training a multi-agent method.
Single-agent RL
01
DQN
DQN is an extension of Q-learning that uses deep neural networks to approximate the Q-function
02
DDQN
DDQN is a variant of DQN that uses a second network to decouple action selection from action evaluation, which reduces overestimation and stabilizes training (sketched below)
03
DDPG
DDPG is a model-free algorithm that combines the actor-critic approach with ideas from DQN to handle continuous action spaces
04
A3C
A3C is an actor-critic method in which multiple workers train in parallel on separate copies of the environment, which stabilizes convergence
05
PPO
PPO is a policy gradient method that clips the objective function to keep each policy update close to the previous policy (sketched below)
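To illustrate the DDQN idea mentioned above, here is a minimal sketch of how the bootstrap target differs between DQN and DDQN. This is PyTorch-style illustration code; the function and variable names are our own for this example and are not taken from our implementation.

import torch

def dqn_target(reward, next_obs, done, target_net, gamma=0.99):
    # Vanilla DQN: the target network both selects and evaluates the next action,
    # which tends to overestimate Q-values because of the max operator.
    next_q = target_net(next_obs).max(dim=1).values
    return reward + gamma * (1.0 - done) * next_q

def ddqn_target(reward, next_obs, done, online_net, target_net, gamma=0.99):
    # Double DQN: the online network selects the next action and the target
    # network evaluates it, decoupling selection from evaluation.
    next_actions = online_net(next_obs).argmax(dim=1, keepdim=True)
    next_q = target_net(next_obs).gather(1, next_actions).squeeze(1)
    return reward + gamma * (1.0 - done) * next_q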
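The clipping used by PPO can be summarized in a few lines. The sketch below shows the clipped surrogate objective; the function name and the default clipping range of 0.2 are illustrative assumptions, not details of our implementation.

import torch

def ppo_clipped_loss(ratio, advantage, clip_eps=0.2):
    # ratio = pi_new(a|s) / pi_old(a|s); advantage = estimated advantage A(s, a)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # PPO maximizes the minimum of the two terms; the negative sign turns it into a loss
    return -torch.min(unclipped, clipped).mean()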
Multi-agent RL
MADDPG
Many real-world applications of RL involve interactions among multiple agents. Traditional RL algorithms designed for single-agent settings do not perform well in multi-agent domains: from each agent's perspective the environment becomes non-stationary, which poses challenges for Q-learning, and policy gradient methods suffer from high variance.
In this work, we implement and train MADDPG, a recently developed RL method for multi-agent settings. MADDPG extends the actor-critic policy gradient framework: during training, each critic is given additional information about the other agents' policies, while this information is hidden from the actors. During execution, only the local actors are used, each acting in a decentralized manner on its own observations. Using MADDPG, we trained two agents to compete against each other in the car racing environment.
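To make the training/execution split concrete, here is a minimal sketch of the two network types in this centralized-critic, decentralized-actor setup. It is written in PyTorch; the class names, layer sizes, and tanh output are illustrative assumptions and do not reflect our exact implementation.

import torch
import torch.nn as nn

class CentralizedCritic(nn.Module):
    # Used only during training: scores the joint observations and actions of all agents.
    def __init__(self, obs_dims, act_dims, hidden=128):
        super().__init__()
        joint_dim = sum(obs_dims) + sum(act_dims)  # concatenate everything
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # Q-value for one agent
        )

    def forward(self, all_obs, all_actions):
        # all_obs / all_actions: lists of per-agent tensors, batch-first
        x = torch.cat(all_obs + all_actions, dim=-1)
        return self.net(x)

class DecentralizedActor(nn.Module):
    # Used at execution time: maps an agent's own observation to its action.
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # continuous actions in [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)

During training each agent's critic sees the full joint observation-action tuple, but at race time only the decentralized actors are queried, one per car.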