CASA: A Bridge Between Gradient of Policy Improvement and Policy Evaluation.
Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng
Published in: CoRR (2021)
Keyphrases
- policy evaluation
- least squares
- policy gradient
- policy iteration
- reinforcement learning
- monte carlo
- temporal difference
- markov decision processes
- model free
- variance reduction
- optimal policy
- function approximation
- semi parametric
- td learning
- statistical inference
- fixed point
- partially observable markov decision processes
- gradient method
- optical flow
- markov decision problems
- gaussian process
- markov decision process
- machine learning
- bayesian inference
- dynamical systems
- dynamic programming
- decision making