Quantile-Based Policy Optimization for Reinforcement Learning.
Jinyang Jiang, Jiaqiao Hu, Yijie Peng
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- Markov decision process
- action selection
- state space
- Markov decision processes
- learning algorithm
- partially observable environments
- policy iteration
- reward function
- optimization algorithm
- decision problems
- policy gradient
- control policy
- function approximators
- partially observable
- optimization problems
- optimization process
- global optimization
- temporal difference
- partially observable Markov decision processes
- probability distribution
- reinforcement learning algorithms
- model free
- asymptotically optimal
- multi objective
- learning process
- optimal control
- control policies
- multi agent
- genetic algorithm