Characterizing Agent Behavior Under Meta Reinforcement Learning With Gridworld






The capabilities of meta reinforcement learning agents tend to depend heavily on the complexity and scope of the meta task on which they operate, with different meta tasks requiring different models, learning algorithms, and strategies to perform well. In this thesis, we show the fragility of agent design and the limitations of agents across Gridworld-based meta tasks of increasing complexity. We begin by characterizing the complexity of meta tasks within a domain generalization context. We run experiments that demonstrate the ability of agents to perform effectively on meta tasks parameterized with different environmental states but similar underlying rules. Next, we perform experiments that expose the limitations of those same agents on tasks with different underlying rules but similar observational spaces. These experiments show that generalization-based strategies succeed on meta tasks that sample from a small scope of base tasks with similar underlying rules, but break down beyond that level of complexity. We also infer from observed agent behaviors that these limitations are attributable to the nature of the model architecture and the meta task design. Furthermore, we run experiments that identify the sensitivity of agent behavior to physical features by augmenting the agent observation size. These experiments show that agents are resilient to limited environmental information but lack spatial awareness when given abundant environmental information. Overall, this work provides a baseline for meta reinforcement learning with the Gridworld task and highlights the considerations necessary for agent and environment design.