R2PG: Risk-Sensitive and Reliable Policy Gradient.
Bo Liu, Ji Liu, Kenan Xiao
Published in: AAAI Workshops (2018)
Keyphrases
- risk sensitive
- policy gradient
- optimal control
- Markov decision processes
- dynamic programming
- reinforcement learning algorithms
- utility function
- model free
- reinforcement learning
- average reward
- infinite horizon
- control policies
- function approximation
- control strategy
- average cost
- sufficient conditions
- learning algorithm
- control strategies
- gradient method
- decision makers