Stochastic Optimization Methods for Policy Evaluation in Reinforcement Learning.
Yi Zhou, Shaocong Ma
Published in: Found. Trends Optim. (2024)
Keyphrases
- optimization methods
- policy evaluation
- reinforcement learning
- monte carlo
- stochastic methods
- temporal difference
- least squares
- model free
- stage stochastic programs
- policy iteration
- markov decision processes
- optimization method
- function approximation
- optimization problems
- simulated annealing
- td learning
- variance reduction
- reinforcement learning algorithms
- semi parametric
- optimal policy
- state space
- gradient method
- markov chain
- genetic algorithm
- evolutionary algorithm
- markov decision problems
- learning algorithm
- evaluation function
- markov decision process
- average reward
- importance sampling
- linear model
- multi agent
- machine learning