Reinforcement Learning with Deep Energy-Based Policies.
Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- control policies
- Markov decision process
- Markov decision processes
- partially observable Markov decision processes
- reinforcement learning agents
- reward function
- cooperative multi agent systems
- reinforcement learning algorithms
- total reward
- control policy
- function approximation
- fitted Q iteration
- temporal difference
- Markov decision problems
- hierarchical reinforcement learning
- dynamic programming
- model free
- policy gradient methods
- state space
- long run
- approximate policy iteration
- decision problems
- state and action spaces
- machine learning
- infinite horizon
- management policies
- deep learning
- state abstraction
- search space
- multi agent reinforcement learning
- decision processes
- action space
- macro actions
- sufficient conditions
- average cost
- partially observable
- data sets