Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning.
Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson
Published in: ICLR (Poster) (2019)
Keyphrases
- imitation learning
- reinforcement learning
- actor-critic
- policy gradient
- multi-agent
- reinforcement learning algorithms
- temporal difference
- reinforcement learning methods
- function approximation
- average reward
- variance reduction
- state space
- optimal control
- reward function
- sample size
- model-free
- machine learning
- policy iteration
- control problems
- optimal policy
- Markov decision processes
- gradient method
- partially observable
- control strategies
- learning capabilities
- action selection
- maximum margin
- long run
- humanoid robot
- learning algorithm
- state action
- transfer learning
- mobile robot
- search space
- feature space