Improving Transparency and Intelligibility of Multi-Objective Probabilistic Planning
Roykrong Sukkerd.
PhD thesis, CMU-ISR-22-104, Institute for Software Research, School of Computer Science, Carnegie Mellon University, 2022.
Abstract
Sequential decision-making problems with multiple objectives arise naturally in many application domains of AI-enabled systems. As these systems are increasingly used to work with people or to make decisions that impact people, it is important that their reasoning be intelligible to end-users and stakeholders, to foster trust and effective human-agent collaboration. However, understanding the reasoning behind solutions to sequential decision problems is difficult for end-users even when white-box decision models such as Markov decision processes (MDPs) are used. This intelligibility challenge stems from the combinatorial explosion of possible strategies for solving long-horizon problems. The multi-objective optimization aspect further complicates the problem, as different objectives may conflict and reasoning about tradeoffs is required. These complexities make it hard for end-users to know whether the agent has made the right decisions for a given context, and may prevent them from intervening when the agent is wrong. The goal of this thesis is to develop an explainability framework that enables an agent making sequential decisions to communicate its goals and the rationale for its behavior to end-users.
We present an explainable planning framework for MDPs, particularly to support problem domains with multiple optimization objectives. We propose consequence-oriented contrastive explanations, in which the argument for an agent's policy is framed in terms of its expected consequences for the task objectives, placed in the context of selected viable alternatives to demonstrate the agent's optimization and tradeoff reasoning. Our modeling framework supports reward decomposition and augments the MDP representation to ground the components of the reward or cost function in domain-level concepts and semantics, facilitating explanation generation. Our explanation generation method computes policy-level contrastive foils that describe the inflection points in the agent's decision making in terms of optimization and tradeoff reasoning over the decomposed task objectives. We demonstrate the applicability of our explainable planning framework by applying it to three planning problem domains: waypoint-based navigation, UAV mission planning, and clinic scheduling.
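As a rough illustration of the reward-decomposition idea, the following minimal Python sketch (not the thesis's actual framework; the objective names, transition model, and policies are hypothetical) evaluates a policy against each decomposed cost component of a small MDP and contrasts it with a viable alternative, which is the kind of per-objective consequence summary a contrastive explanation can be built from:

```python
# Minimal sketch: a small MDP whose cost function is decomposed into named,
# domain-level objectives, so a policy can be summarized by its expected
# consequences on each objective. All names and numbers are illustrative.
import numpy as np

n_states, n_actions = 4, 2
gamma = 0.95

# Transition model: P[a][s, s'] = probability of reaching s' from s under action a.
P = [np.full((n_states, n_states), 1.0 / n_states) for _ in range(n_actions)]

# Decomposed cost function: one cost matrix per named objective
# (hypothetical objectives for a navigation domain).
costs = {
    "travel_time":    np.random.rand(n_states, n_actions),
    "collision_risk": np.random.rand(n_states, n_actions),
    "intrusiveness":  np.random.rand(n_states, n_actions),
}

def evaluate_policy(policy, cost):
    """Expected discounted cost of a deterministic policy for one objective."""
    P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
    c_pi = np.array([cost[s, policy[s]] for s in range(n_states)])
    # Solve (I - gamma * P_pi) V = c_pi for the per-state value vector.
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, c_pi)

policy = [0, 1, 0, 1]        # the agent's candidate policy
alternative = [1, 1, 1, 1]   # a selected viable alternative for contrast

# Consequence-oriented, contrastive summary from an initial state s0 = 0:
# per-objective expected costs of the agent's policy versus the alternative.
for name, c in costs.items():
    v_agent = evaluate_policy(policy, c)[0]
    v_alt = evaluate_policy(alternative, c)[0]
    print(f"{name}: agent={v_agent:.2f}, alternative={v_alt:.2f}")
```

Grounding each cost component in a named, domain-level objective is what lets such a summary be phrased in terms users recognize (e.g., travel time versus intrusiveness) rather than as a single scalar value.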
We design and conduct a human subjects experiment to evaluate the effectiveness of explanations based on measurable task performance. The users' task in the experiment is to assess the agent's planning decisions and determine whether they are the best decisions for a given problem context. Our experimental results show that our proposed consequence-oriented contrastive explanation approach significantly improves both the users' ability to correctly assess the agent's planning decisions and their confidence in that assessment.
Lastly, we investigate the feasibility of a user-guided approach to our consequence-oriented contrastive explanation paradigm. We propose a theoretical framework and approaches to formulate why-not behavioral questions as state-action constraints and linear temporal logic constraints on the planning problem, and to solve for satisfying policies in order to explain the full impact that the queried behavior has on subsequent decisions and on the task objectives.
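To make the why-not paradigm concrete in the simplest possible terms, the sketch below (again hypothetical: the brute-force policy search, scalarization weights, and query encoding are assumptions for illustration, not the thesis's method) enforces a queried state-action pair as a hard constraint, re-solves for the best satisfying policy, and reports the impact on each objective:

```python
# Illustrative sketch: answer "Why not take action a_q in state s_q?" by adding
# a state-action constraint, re-solving for the best satisfying policy, and
# comparing its per-objective consequences against the agent's policy.
import itertools
import numpy as np

n_states, n_actions = 3, 2
gamma = 0.9
rng = np.random.default_rng(0)

P = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_actions)]
costs = {"time":   rng.random((n_states, n_actions)),
         "energy": rng.random((n_states, n_actions))}
weights = {"time": 1.0, "energy": 0.5}   # hypothetical scalarization weights

def scalar_value(policy, s0=0):
    """Weighted total cost of a policy from s0, plus its per-objective breakdown."""
    P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
    total, per_obj = 0.0, {}
    for name, c in costs.items():
        c_pi = np.array([c[s, policy[s]] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, c_pi)[s0]
        per_obj[name] = v
        total += weights[name] * v
    return total, per_obj

def best_policy(constraint=None):
    """Brute-force search over deterministic policies (feasible only for tiny MDPs)."""
    best = None
    for policy in itertools.product(range(n_actions), repeat=n_states):
        if constraint and policy[constraint[0]] != constraint[1]:
            continue  # violates the "must take a_q in s_q" constraint
        total, per_obj = scalar_value(policy)
        if best is None or total < best[0]:
            best = (total, per_obj, policy)
    return best

unconstrained = best_policy()
why_not = best_policy(constraint=(1, 1))   # query: "Why not action 1 in state 1?"

print("agent policy:", unconstrained[2], unconstrained[1])
print("queried behavior enforced:", why_not[2], why_not[1])
```

Comparing the two solutions shows the full downstream effect of the queried behavior: the constrained policy may have to change decisions in other states as well, and the per-objective differences quantify what the user's suggestion would cost or gain. Temporal-logic constraints generalize this idea from a single state-action pair to constraints over whole trajectories.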
Keywords: Explainable Software, Planning, Self-adaptation.