Analyzing the Effect of Stochastic Transitions in Policy Gradients in Deep Reinforcement Learning.
Ângelo Gregório Lovatto, Thiago Pereira Bueno, Leliane Nunes de Barros
Published in: BRACIS (2019)
Keyphrases
- transition model
- state transition
- reinforcement learning
- state transitions
- state space
- control policies
- optimal policy
- model free reinforcement learning
- reward function
- direct policy search
- policy iteration algorithm
- policy search
- markov decision process
- machine learning
- learning automata
- function approximators
- markov decision processes
- control policy
- state action
- multi agent
- continuous state spaces
- continuous state
- dynamic programming
- stochastic approximation
- temporal difference
- action space
- partially observable markov decision processes
- partially observable
- optimal control