Final project for Reinforcement Learning course
Authors: Lorenzo Basile, Irene Brugnara
Source files:
gridworld.py contains the implementation of the class Gridworld, which defines a two-dimensional grid environment in which an agent moves from an initial cell to a target cell. The position of the target cell is unknown to the agent, but at each time step the agent receives a random binary signal from the target whose distribution depends on the agent's distance from the target. The method gridworld_search implements a search algorithm based on Thompson sampling (or a greedy algorithm if greedy=True).
animation.py contains an example animated run of the search algorithm.
benchmark.py and analysis.py contain code to collect and process, respectively, data on larger-scale runs of both the Thompson sampling algorithm and the greedy algorithm. benchmark.py produces a pickle file, which analysis.py then reads.
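
To illustrate the idea behind the search, here is a minimal, self-contained sketch of Thompson sampling on a grid with a distance-dependent binary signal. It is not the code in gridworld.py. The function names (search, step_towards, signal_probability, distance), the Manhattan distance, and the exponential signal model are all assumptions made for this example. The agent keeps a belief over candidate target cells, updates it with Bayes' rule after each signal, and then moves one step towards either a cell sampled from the belief (Thompson sampling) or the most probable cell (greedy).

```python
import math
import random


def distance(a, b):
    """Manhattan distance between two cells given as (row, col). Assumed metric."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def signal_probability(d, scale=2.0):
    """Assumed signal model: P(signal = 1) decays exponentially with distance d."""
    return math.exp(-d / scale)


def step_towards(pos, goal):
    """Move one cell towards goal: first along the row axis, then along the column axis."""
    if pos[0] != goal[0]:
        return (pos[0] + (1 if goal[0] > pos[0] else -1), pos[1])
    if pos[1] != goal[1]:
        return (pos[0], pos[1] + (1 if goal[1] > pos[1] else -1))
    return pos


def search(size, start, target, greedy=False, max_steps=10_000, rng=None):
    """Search a size x size grid for target; return (final position, steps, found)."""
    rng = rng or random.Random()
    cells = [(r, c) for r in range(size) for c in range(size)]
    belief = {cell: 1.0 / len(cells) for cell in cells}  # uniform prior
    pos, steps = start, 0

    while pos != target and steps < max_steps:
        # The environment emits a random binary signal based on the true distance.
        signal = rng.random() < signal_probability(distance(pos, target))

        # Bayes update: weight each candidate cell by the likelihood of the signal.
        for cell in cells:
            p = signal_probability(distance(pos, cell))
            belief[cell] *= p if signal else 1.0 - p
        belief[pos] = 0.0  # the agent is here and has not found the target
        total = sum(belief.values())
        for cell in cells:
            belief[cell] /= total

        if greedy:
            # Greedy: head for the most probable cell.
            goal = max(cells, key=belief.get)
        else:
            # Thompson sampling: head for a cell drawn from the posterior.
            goal = rng.choices(cells, weights=[belief[c] for c in cells])[0]

        pos = step_towards(pos, goal)
        steps += 1

    return pos, steps, pos == target
```

For example, search(5, (0, 0), (4, 4)) runs the Thompson sampling variant on a 5x5 grid, and passing greedy=True switches to the greedy variant. Passing a seeded random.Random instance as rng makes a run reproducible.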