Model-Free Trajectory Optimization for Reinforcement Learning.
Riad Akrour, Gerhard Neumann, Hany Abdulsamad, Abbas Abdolmaleki
Published in: ICML (2016)
Keyphrases
- model-free
- reinforcement learning
- reinforcement learning algorithms
- function approximation
- temporal difference
- policy iteration
- Markov decision processes
- rl algorithms
- reinforcement learning methods
- neural network
- average reward
- temporal difference learning
- state space
- policy evaluation
- dynamic programming
- multi-agent
- feature selection