Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization.
Igor Kuznetsov
Published in: CoRR (2022)
Keyphrases
- monte carlo
- temporal difference
- guided exploration
- reinforcement learning
- monte carlo methods
- stochastic approximation
- reinforcement learning algorithms
- actor critic
- function approximation
- markov chain
- markovian decision
- policy evaluation
- monte carlo simulation
- adaptive sampling
- importance sampling
- exploratory learning
- particle filter
- temporal difference learning
- state space
- function approximators
- monte carlo tree search
- policy iteration
- variance reduction
- markov decision processes
- quasi monte carlo
- global illumination
- machine learning
- optimal solution
- optimal policy
- model free
- confidence intervals