Action Advising with Advice Imitation in Deep Reinforcement Learning.
Ercument Ilhan, Jeremy Gow, Diego Perez Liebana
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- action selection
- state action
- action space
- partially observable domains
- state space
- function approximation
- reward shaping
- agent learns
- transition model
- reinforcement learning algorithms
- human actions
- temporal difference learning
- model free
- temporal difference
- Markov decision processes
- optimal policy
- partially observable
- sensory inputs
- robotic control
- function approximators
- machine learning
- Markov decision process
- reward function
- optimal control
- learning problems
- action recognition
- data sets
- dynamic programming
- learning algorithm
- multi agent
- fitted Q iteration
- deep learning
- agent receives
- state and action spaces
- learning process
- autonomous learning
- Markov decision problems
- reasoning about actions