Soft-Robust Actor-Critic Policy-Gradient.
Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
Published in: CoRR (2018)
Keyphrases
- policy gradient
- actor critic
- reinforcement learning
- optimal control
- gradient method
- function approximation
- temporal difference
- policy gradient methods
- reinforcement learning algorithms
- neuro fuzzy
- approximate dynamic programming
- average reward
- variance reduction
- approximation methods
- neural network
- markov decision processes
- optimization method
- single agent
- partially observable markov decision processes
- least squares
- multi agent systems
- natural actor critic