Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations.
Xiaoqin Zhang, Huimin Ma
Published in: CoRR (2018)
Keyphrases
- reinforcement learning algorithms
- actor critic
- reinforcement learning
- markov decision processes
- model free
- state space
- temporal difference
- policy gradient
- reinforcement learning problems
- reinforcement learning methods
- learning algorithm
- temporal difference learning
- function approximation
- reward function
- stochastic games
- optimal policy
- dynamic environments
- function approximators
- decision making
- least squares
- dynamic programming
- multi agent
- neural network