Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic.
Zhihai Wang, Jie Wang, Qi Zhou, Bin Li, Houqiang Li
Published in: AAAI (2022)
Keyphrases
- reinforcement learning
- actor critic
- function approximation
- model free
- temporal difference
- reinforcement learning algorithms
- policy gradient
- neuro fuzzy
- optimal control
- approximate dynamic programming
- dynamic programming
- linear program
- natural actor critic
- gradient method
- evaluation function
- learning problems
- Markov decision processes
- transfer learning
- optimal policy
- cost function