Actor-Critic Reinforcement Learning with Energy-Based Policies.
Nicolas Heess, David Silver, Yee Whye Teh
Published in: EWRL (2012)
Keyphrases
- actor critic
- reinforcement learning
- policy gradient methods
- optimal policy
- policy gradient
- temporal difference
- optimal control
- reinforcement learning algorithms
- natural actor critic
- approximate dynamic programming
- policy iteration
- neuro fuzzy
- partially observable markov decision processes
- markov decision problems
- average reward
- function approximation
- gradient method
- state space
- markov decision process
- control policy
- markov decision processes
- model free
- reward function
- machine learning
- dynamic programming
- transfer learning
- decision problems
- long run
- finite state
- learning algorithm
- monte carlo
- rl algorithms
- temporal difference learning
- control problems
- partially observable
- state action
- infinite horizon
- approximation methods
- adaptive control
- multi agent
- reinforcement learning problems
- average cost