Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents.
Yael Septon, Tobias Huber, Elisabeth André, Ofra Amir
Published in: CoRR (2022)
Keyphrases
- reinforcement learning agents
- reinforcement learning
- optimal policy
- reward function
- dynamic environments
- average reward
- policy gradient
- state abstraction
- action selection
- inverse reinforcement learning
- expected reward
- function approximation
- policy iteration
- Markov decision process
- multi agent
- Markov decision problems
- state space
- partially observable
- model free
- reinforcement learning algorithms
- infinite horizon
- search algorithm
- transfer learning
- initial state
- Markov decision processes
- action space
- data points
- dynamic programming
- long run
- multi agent environments
- machine learning
- data mining