Explaining Learned Reward Functions with Counterfactual Trajectories.
Jan Wehner, Frans A. Oliehoek, Luciano Cavalcante Siebert
Published in: CoRR (2024)
Keyphrases
- reward function
- markov decision processes
- reinforcement learning
- state space
- optimal policy
- trajectory data
- simple examples
- inverse reinforcement learning
- multiple agents
- reinforcement learning algorithms
- moving objects
- state variables
- transition probabilities
- markov decision process
- machine learning
- search engine