Food Delivery with Q-Learning


About this project

This reinforcement learning experiment simulates an agent delivering food.

The agent picks up items, delivers them to stations, and earns rewards. Some cells are pirate cells, where any food the agent is carrying gets stolen. Carrying food also requires energy and uses it up, and the agent can recharge its energy (shown as a green arc) at energy station cells. There are also portal pairs that the agent can use to teleport.

The agent can take 6 different actions: move in one of 4 directions, do nothing, or try to teleport (which does nothing unless the agent is on a portal cell).
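As a rough illustration of this action set, here is a minimal Python sketch. The grid size, the portal coordinates, the names `Action`, `PORTALS`, and `next_position`, and the choice to clamp movement at the grid edges are all assumptions made for the example, not details taken from the project.

```python
from enum import Enum


class Action(Enum):
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3
    STAY = 4
    TELEPORT = 5


GRID_SIZE = 4  # assumed square grid, for illustration only

# Hypothetical portal pair: each end maps to the other end.
PORTALS = {(0, 0): (3, 3), (3, 3): (0, 0)}

# (dx, dy) offsets for the four movement actions.
MOVES = {
    Action.UP: (0, -1),
    Action.DOWN: (0, 1),
    Action.LEFT: (-1, 0),
    Action.RIGHT: (1, 0),
}


def next_position(pos, action):
    """Return the cell reached from `pos` after taking `action`."""
    if action is Action.TELEPORT:
        # Teleporting does nothing unless the agent stands on a portal cell.
        return PORTALS.get(pos, pos)
    dx, dy = MOVES.get(action, (0, 0))  # STAY falls back to no movement
    # Assumption: moving into a wall leaves the agent on the edge cell.
    x = min(max(pos[0] + dx, 0), GRID_SIZE - 1)
    y = min(max(pos[1] + dy, 0), GRID_SIZE - 1)
    return (x, y)
```

With these assumed values, `next_position((1, 1), Action.TELEPORT)` leaves the agent where it is, while the same action taken on `(0, 0)` moves it to the paired portal.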

At any time, the agent observes only its position, the position of the food, whether or not it is carrying food, and its energy level.
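One way such an observation could be turned into a key for a Q-table is sketched below in Python. The `State` tuple, the `discretize_energy` helper, and the idea of bucketing energy into 5 levels are assumptions for illustration; the project may encode its state differently.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    agent_pos: Tuple[int, int]  # (x, y) cell of the agent
    food_pos: Tuple[int, int]   # (x, y) cell of the food
    carrying: bool              # whether the agent currently holds food
    energy_level: int           # assumed: energy bucketed into a few levels


def discretize_energy(energy: float, max_energy: float, levels: int = 5) -> int:
    """Map a continuous energy value to a bucket in 0..levels-1."""
    ratio = min(max(energy / max_energy, 0.0), 1.0)
    return min(int(ratio * levels), levels - 1)


# A named tuple is hashable, so it can index a dictionary-based Q-table.
state = State((1, 2), (3, 0), False, discretize_energy(7.0, 10.0))
```

Keeping the state small and discrete matters for tabular Q-learning, since the table needs one entry per distinct state the agent can encounter.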

The agent learns with the Q-learning algorithm. In this algorithm, in order to explore different strategies, you sometimes have to take a random action rather than choose the action that seems the most rewarding based on what you've learned so far. How often this happens is controlled by a probability, epsilon.
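The epsilon-greedy choice and the standard tabular Q-learning update can be sketched in Python as follows. The learning rate `ALPHA`, the discount factor `GAMMA`, and the names `q_table`, `choose_action`, and `update` are assumptions for the example; the project's actual values and code are not given here.

```python
import random
from collections import defaultdict

N_ACTIONS = 6   # 4 moves, do nothing, teleport
ALPHA = 0.1     # assumed learning rate
GAMMA = 0.95    # assumed discount factor

# Q-table: maps a (hashable) state to one estimated value per action.
q_table = defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)  # random exploratory action
    values = q_table[state]
    # Index of the highest estimated value (first one wins on ties).
    return max(range(N_ACTIONS), key=values.__getitem__)


def update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward r + gamma * max Q(s', .)."""
    best_next = max(q_table[next_state])
    td_target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (td_target - q_table[state][action])
```

With epsilon at 0 the agent always picks the action it currently rates best, and with epsilon at 1 it acts purely at random; values in between trade off exploiting what it has learned against trying something new.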

There is a rendering mode that lets you watch, with smooth animation, what the agent is currently doing.

Increase the simulation steps per frame slider for faster learning! (I kept the default value small to go easy on slower devices.)