Games on the Atari 2600 platform have served as a benchmark for reinforcement learning algorithms in recent years. While deep reinforcement learning approaches make progress on most games, there remain games on which the majority of these algorithms struggle; these are called hard exploration games. We introduce two extensions to the Random Network Distillation (RND) architecture: we apply self-attention and an *ego motion* mechanism to the RND architecture, and we evaluate them on three hard exploration tasks from the Atari platform. We find that the proposed ego network model improves on the RND baseline on these tasks.
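For context, the core idea of RND (Burda et al., 2018) is to derive an intrinsic exploration reward from the prediction error of a trainable "predictor" network regressed onto a fixed, randomly initialised "target" network: the error is large for novel observations and decays as an observation is revisited. The sketch below illustrates this mechanism only; the linear networks, dimensions, and learning rate are illustrative choices, not the architecture used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, emb_dim, lr = 8, 4, 0.05

# Fixed random target network (never trained) and a trainable predictor.
W_target = rng.normal(size=(obs_dim, emb_dim))
W_pred = rng.normal(size=(obs_dim, emb_dim))

def intrinsic_reward(obs):
    """Squared prediction error between predictor and frozen target."""
    err = obs @ W_pred - obs @ W_target
    return float(np.sum(err ** 2))

def update_predictor(obs):
    """One SGD step regressing the predictor onto the target embedding."""
    global W_pred
    err = obs @ W_pred - obs @ W_target      # shape (emb_dim,)
    W_pred -= lr * np.outer(obs, err)        # gradient of 0.5 * ||err||^2

obs = rng.normal(size=obs_dim)
before = intrinsic_reward(obs)
for _ in range(200):                         # revisit the same observation
    update_predictor(obs)
after = intrinsic_reward(obs)
# The intrinsic reward for a repeatedly seen observation decays,
# so the agent is pushed toward states it has not yet visited.
```

In the full method the same principle applies, but target and predictor are convolutional networks over Atari frames, and the intrinsic reward is added to the environment reward during policy optimisation.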