Decision-Making for Autonomous Systems in Partially Observable Environments
Decision-making for autonomous systems acting in real-world domains is complex and difficult to formalize. For instance, consider a mobile robot navigating autonomously in an automated manufacturing facility, tasked with transporting hazardous material from a collection site to a disposal site. This is a navigation problem in which the robot has to consider numerous factors such as collision avoidance, recognition of goal locations, accurate selection of the desired material, and knowledge of its location within the facility. The difficulty often lies in reliably modeling the uncertainties and dynamics of the robot-environment interaction when the robot can only partially observe the state of the environment. A principal problem in designing mobile robots that can efficiently and autonomously navigate indoor domains to achieve a desired task is therefore to construct robust models for efficient planning and motion control in stochastic domains. This remains a difficult and open problem despite significant advances. The robot must generate efficient policies to reliably accomplish its tasks while accounting for uncertainties in both its actions and its perception. In this dissertation we model the uncertainties in action selection and perception using a sequential decision-making model. The mathematical formalism adopted is the Partially Observable Markov Decision Process (POMDP), a generalization of the well-known Markov Decision Process (MDP). Although POMDPs provide a robust formalism for modeling agent-based decision making, computing optimal solutions remains very difficult and often intractable for problems with large state spaces, because of the high dimensionality of the underlying belief space. We propose a technique called Goal-Specific Representation (GSR) that exploits domain structure and generates policies over a subset of the state space given a map of the domain, a starting location, and a goal location.
We solve the resulting POMDP model using a Point-Based Value Iteration (PBVI) solver and apply the generated policies to navigation on an autonomous robot. We anticipate that the results of this work can be applied in manufacturing facilities to enhance automation and in healthcare domains to support assisted care.
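
To make the notion of a belief space concrete, the following is a minimal sketch of the standard POMDP belief update (a Bayes filter over discrete states). It is not the GSR technique or the PBVI solver used in this work. The three-cell corridor, the tables T and O, the action name "right", and the function name belief_update are all hypothetical and chosen only for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Standard discrete POMDP belief update (illustrative sketch).

    belief[s]      : current probability of being in state s
    T[a][s][s2]    : probability of moving from s to s2 under action a
    O[a][s2][o]    : probability of observing o after reaching s2 under a
    Returns the normalized posterior belief over states.
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical 3-cell corridor; cell 2 is the goal (assumed values throughout).
# Moving "right" succeeds with probability 0.8 and fails (stay) with 0.2.
T = {"right": [[0.2, 0.8, 0.0],
               [0.0, 0.2, 0.8],
               [0.0, 0.0, 1.0]]}
# Observation 0 = "plain corridor", observation 1 = "goal marker seen".
O = {"right": [[0.9, 0.1],
               [0.9, 0.1],
               [0.2, 0.8]]}

b0 = [1.0 / 3, 1.0 / 3, 1.0 / 3]          # robot starts fully uncertain
b1 = belief_update(b0, "right", 1, T, O)  # move right, then see the goal marker
print(b1)
```

A belief over n states is a point in an (n-1)-dimensional probability simplex, so each added state adds a dimension to the space a policy must cover. This is why restricting planning to a relevant subset of states, as GSR proposes, can reduce the cost of solving the model.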