Addressing reward bias in Adversarial Imitation Learning with neutral reward functions.
Rohit Jena, Siddharth Agrawal, Katia P. Sycara
Published in: CoRR (2020)
Keyphrases
- reward function
- imitation learning
- reinforcement learning
- multi agent
- state space
- Markov decision processes
- reinforcement learning algorithms
- inverse reinforcement learning
- reinforcement learning methods
- function approximation
- optimal policy
- maximum margin
- average reward
- multiple agents
- humanoid robot
- robotic systems
- transfer learning
- machine learning
- Markov decision process
- dynamic programming
- model free
- transition probabilities
- state variables
- supervised learning
- policy iteration
- optimal control