Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents.
Yael Septon, Tobias Huber, Elisabeth André, Ofra Amir
Published in: PAAMS (2023)
Keyphrases
- reinforcement learning agents
- reinforcement learning
- optimal policy
- reward function
- state abstraction
- dynamic environments
- expected reward
- policy gradient
- markov decision process
- average reward
- inverse reinforcement learning
- markov decision processes
- partially observable
- state space
- policy iteration
- transfer learning
- markov decision problems
- action selection
- long run
- reinforcement learning algorithms
- function approximators
- function approximation
- multi agent
- learning algorithm
- partially observable markov decision processes
- machine learning
- model free
- action space
- semi supervised learning
- dynamic programming
- search algorithm